C# - Interesting String Methods
Introduction
In this post, we will look at some interesting string methods in C#. For example:
- You’ll use the IndexOf() method to locate the position of one or more characters inside a larger string.
- You’ll use the Substring() method to return the part of the larger string that follows the character position you specify.
- You’ll also use an overloaded version of the Substring() method to set the number of characters to return after a specified position in a string.
Examples:
Example 1: Using the IndexOf() method to find parenthesis pairs embedded in a string
string message = "Find what is (inside the parentheses)";
int openingPosition = message.IndexOf('(');
int closingPosition = message.IndexOf(')');
Console.WriteLine(openingPosition);
Console.WriteLine(closingPosition);
- Output:
13
36
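IndexOf() also accepts a string instead of a single character. In that case it returns the zero-based position where the match begins, and it returns -1 when nothing matches. A minimal sketch of both cases; the variable names wordPosition and missingPosition are made up for illustration:

```csharp
using System;

string sample = "Find what is (inside the parentheses)";

// Searching for a string returns the index where the match begins.
int wordPosition = sample.IndexOf("inside");

// Searching for a character that is absent returns -1.
int missingPosition = sample.IndexOf('[');

Console.WriteLine(wordPosition);    // 14
Console.WriteLine(missingPosition); // -1
```

The 14 lines up with the output above: the opening parenthesis is at index 13, so the word right after it starts at 14.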
Example 2: Using the Substring() method to retrieve the value between parentheses
string message = "Find what is (inside the parentheses)";
int openingPosition = message.IndexOf('(');
int closingPosition = message.IndexOf(')');
openingPosition += 1;
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
- Output:
inside the parentheses
- The Substring() method needs the starting position and the number of characters, or length, to retrieve.
- So, you calculate the length in a temporary variable called length, and pass it with the openingPosition value to retrieve the string inside the parentheses.
- To keep the opening parenthesis out of the output, the code skips its index by adding 1 to the openingPosition value.
- The value is 1 because that is the length of the character. If you need to locate a value that starts after a longer string, for example <div> or ---, you would use the length of that string instead.
string message = "What is the value <span>between the tags</span>?";
int openingPosition = message.IndexOf("<span>");
int closingPosition = message.IndexOf("</span>");
openingPosition += 6;
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
Avoid magic values
- Hardcoded strings like "<span>" in the previous code listing are known as "magic strings", and hardcoded numeric values like 6 are known as "magic numbers".
- These "magic" values are undesirable for many reasons, and you should try to avoid them if possible.
- Review the previous code to consider how it might break if you hardcoded the string "<span>" multiple times, but misspelled one instance of it as "<sapn>".
- The compiler doesn’t catch "<sapn>" at compile time because the value is in a string.
- The misspelling leads to problems at run time, and depending on the complexity of your code, it might be difficult to track down.
- Furthermore, if you change the string "<span>" to the shorter "<div>", but forget to change the number 6 to 5, then your code produces undesirable results.
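To make the two failure modes concrete, here is a small sketch that reproduces them on purpose. The variable names (misspelledResult, divMessage, staleNumberResult) are made up for this illustration. Neither case throws an exception; both just print the wrong text, which is what makes them hard to track down.

```csharp
using System;

string message = "What is the value <span>between the tags</span>?";

// Misspelled magic string: IndexOf() cannot find "<sapn>" and returns -1.
int openingPosition = message.IndexOf("<sapn>");
int closingPosition = message.IndexOf("</span>");

// -1 + 6 = 5, so the extraction silently starts in the wrong place.
openingPosition += 6;
int length = closingPosition - openingPosition;
string misspelledResult = message.Substring(openingPosition, length);
Console.WriteLine(misspelledResult); // is the value <span>between the tags

// Tag shortened to <div>, but the magic number 6 was left unchanged.
string divMessage = "What is the value <div>between the tags</div>?";
int divOpening = divMessage.IndexOf("<div>") + 6; // should be 5
int divClosing = divMessage.IndexOf("</div>");
string staleNumberResult = divMessage.Substring(divOpening, divClosing - divOpening);
Console.WriteLine(staleNumberResult); // etween the tags
```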
string message = "What is the value <span>between the tags</span>?";
const string openSpan = "<span>";
const string closeSpan = "</span>";
int openingPosition = message.IndexOf(openSpan);
int closingPosition = message.IndexOf(closeSpan);
openingPosition += openSpan.Length;
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
- Take a minute to examine the updated code and its use of the const keyword, as in const string openSpan = "<span>";
- A constant lets you define and initialize a value that can never be changed afterward.
- You then use that constant in the rest of the code whenever you need that value.
- This ensures that the value is defined only once, and a misspelled constant name is caught by the compiler.
- Now, if the value of openSpan changes to <div>, the line of code that uses the Length property continues to be valid.
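One way to take this a step further is to move the logic into a small helper that receives the tags as parameters, so the offset always comes from the tag's own Length. This is only a sketch: ExtractBetween is a hypothetical helper, not part of .NET, and returning an empty string for a missing tag is an assumption made for this example.

```csharp
using System;

const string openSpan = "<span>";
const string closeSpan = "</span>";
// openSpan = "<div>"; // would not compile: a const cannot be reassigned

// Hypothetical helper: returns the text between two tags, or "" if either tag is missing.
static string ExtractBetween(string text, string openTag, string closeTag)
{
    int start = text.IndexOf(openTag);
    if (start == -1) return "";
    start += openTag.Length;                 // skip past the opening tag itself
    int end = text.IndexOf(closeTag, start); // look for the closing tag after it
    if (end == -1) return "";
    return text.Substring(start, end - start);
}

string spanResult = ExtractBetween("What is the value <span>between the tags</span>?", openSpan, closeSpan);
string divResult = ExtractBetween("What is the value <div>between the tags</div>?", "<div>", "</div>");

Console.WriteLine(spanResult); // between the tags
Console.WriteLine(divResult);  // between the tags
```

Because nothing in the helper depends on the number 6, switching from <span> to <div> needs no other change.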
Example 3: Using the LastIndexOf() method to retrieve the last occurrence of a substring
- You increase the complexity of the message variable by adding many sets of parentheses, then write code to retrieve the content inside the last set of parentheses.
string message = "(What if) I am (only interested) in the last (set of parentheses)?";
int openingPosition = message.LastIndexOf('(');
openingPosition += 1;
int closingPosition = message.LastIndexOf(')');
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
- Output:
set of parentheses
- The key to this example is the use of LastIndexOf(), which you use to get the positions of the last opening and closing parentheses.
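The difference between the two methods is easiest to see side by side on a short string. A minimal sketch; the sample text and variable names are made up for illustration:

```csharp
using System;

string text = "a(b)c(d)";

int first = text.IndexOf('(');    // searches from the start of the string
int last = text.LastIndexOf('('); // searches from the end of the string

Console.WriteLine(first); // 1
Console.WriteLine(last);  // 5
```

Both methods return a zero-based index counted from the start of the string. Only the direction of the search differs.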
Example 4: Using the Substring() method to retrieve all instances of substrings inside parentheses
- Add a while statement to iterate through the string until all sets of parentheses are discovered, extracted, and displayed.
string message = "(What if) there are (more than) one (set of parentheses)?";
while (true)
{
int openingPosition = message.IndexOf('(');
if (openingPosition == -1) break;
openingPosition += 1;
int closingPosition = message.IndexOf(')');
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
// Note the overload of the Substring to return only the remaining
// unprocessed message:
message = message.Substring(closingPosition + 1);
}
- Output:
What if
more than
set of parentheses
- When you use Substring() without specifying a length input parameter, it returns every character from the starting position you specify to the end of the string.
- With the string being processed, message = "(What if) there are (more than) one (set of parentheses)?", there’s an advantage to removing the first set of parentheses, (What if), from the value of message.
- What remains is then processed in the next iteration of the while loop.
- The IndexOf() method returns -1 if it can’t find the input parameter in the string.
- You merely check for the value -1 and break out of the loop.
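The loop above assumes that every opening parenthesis has a closing one. If a closing parenthesis is missing, IndexOf() returns -1 for it, the computed length becomes negative, and Substring() throws an ArgumentOutOfRangeException. The sketch below is one possible variant that also checks the closing position. ExtractAll is a hypothetical helper name, and stopping at the first unmatched parenthesis is an assumption. Instead of shortening message, it passes a start index to IndexOf().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects the text inside every "(...)" pair.
static List<string> ExtractAll(string text)
{
    var results = new List<string>();
    int searchFrom = 0;
    while (true)
    {
        int openingPosition = text.IndexOf('(', searchFrom);
        if (openingPosition == -1) break;   // no more opening parentheses

        int closingPosition = text.IndexOf(')', openingPosition + 1);
        if (closingPosition == -1) break;   // unmatched "(": stop instead of throwing

        int start = openingPosition + 1;
        results.Add(text.Substring(start, closingPosition - start));
        searchFrom = closingPosition + 1;   // continue after this pair
    }
    return results;
}

List<string> found = ExtractAll("(What if) there are (more than) one (set of parentheses)? (oops");
Console.WriteLine(string.Join(", ", found)); // What if, more than, set of parentheses
```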
Example 5: Using the IndexOfAny() method to work with different types of symbol sets
- Update the message string, adding different types of symbols like square brackets [] and curly braces {}.
- To search for multiple symbols simultaneously, use IndexOfAny().
- You search with IndexOfAny() to return the index of the first symbol from the array openSymbols found in the message string.
string message = "Help (find) the {opening symbols}";
Console.WriteLine($"Searching THIS Message: {message}");
char[] openSymbols = { '[', '{', '(' };
int startPosition = 5;
int openingPosition = message.IndexOfAny(openSymbols);
Console.WriteLine($"Found WITHOUT using startPosition: {message.Substring(openingPosition)}");
openingPosition = message.IndexOfAny(openSymbols, startPosition);
Console.WriteLine($"Found WITH using startPosition {startPosition}: {message.Substring(openingPosition)}");
- Output:
Searching THIS Message: Help (find) the {opening symbols}
Found WITHOUT using startPosition: (find) the {opening symbols}
Found WITH using startPosition 5: (find) the {opening symbols}
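Both searches print the same text because the first symbol, the opening parenthesis, sits exactly at index 5, and the search includes the start position itself. To see the start position make a difference, you can begin one character past the first match. A minimal sketch; firstPosition, nextPosition, and remainder are names made up for illustration:

```csharp
using System;

string message = "Help (find) the {opening symbols}";
char[] openSymbols = { '[', '{', '(' };

// First match: the "(" at index 5.
int firstPosition = message.IndexOfAny(openSymbols);

// Start one character past that match to find the next symbol.
int nextPosition = message.IndexOfAny(openSymbols, firstPosition + 1);

string remainder = message.Substring(nextPosition);
Console.WriteLine(remainder); // {opening symbols}
```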
- You used IndexOfAny() without, and then with, the starting position overload.
- Now that you found an opening symbol, you need to find its matching closing symbol.
string message = "(What if) I have [different symbols] but every {open symbol} needs a [matching closing symbol]?";
// The IndexOfAny() helper method requires a char array of characters.
// You want to look for:
char[] openSymbols = { '[', '{', '(' };
// You'll use a slightly different technique for iterating through
// the characters in the string. This time, use the closing
// position of the previous iteration as the starting index for the
// next open symbol. So, you need to initialize the closingPosition
// variable to zero:
int closingPosition = 0;
while (true)
{
int openingPosition = message.IndexOfAny(openSymbols, closingPosition);
if (openingPosition == -1) break;
string currentSymbol = message.Substring(openingPosition, 1);
// Now find the matching closing symbol
char matchingSymbol = ' ';
switch (currentSymbol)
{
case "[":
matchingSymbol = ']';
break;
case "{":
matchingSymbol = '}';
break;
case "(":
matchingSymbol = ')';
break;
}
// To find the closingPosition, use an overload of the IndexOf method to specify
// that the search for the matchingSymbol should start at the openingPosition in the string.
openingPosition += 1;
closingPosition = message.IndexOf(matchingSymbol, openingPosition);
// Finally, use the techniques you've already learned to display the sub-string:
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
}
- Output:
What if
different symbols
open symbol
matching closing symbol
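As a possible variation on the listing above, you can index the string directly to get a char and pick the closing symbol with a switch expression. This sketch assumes C# 8 or later for the switch expression. It also stops when a closing symbol is missing, which is an added assumption and not part of the original example. The extracted list is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

string message = "(What if) I have [different symbols] but every {open symbol} needs a [matching closing symbol]?";
char[] openSymbols = { '[', '{', '(' };
var extracted = new List<string>();

int closingPosition = 0;
while (true)
{
    int openingPosition = message.IndexOfAny(openSymbols, closingPosition);
    if (openingPosition == -1) break;

    // Index the string directly to get a char instead of a one-character string.
    char matchingSymbol = message[openingPosition] switch
    {
        '[' => ']',
        '{' => '}',
        _ => ')' // only '(' remains, because IndexOfAny matched one of openSymbols
    };

    openingPosition += 1;
    closingPosition = message.IndexOf(matchingSymbol, openingPosition);
    if (closingPosition == -1) break; // no matching closing symbol: stop rather than throw

    extracted.Add(message.Substring(openingPosition, closingPosition - openingPosition));
}

Console.WriteLine(string.Join(" | ", extracted));
// What if | different symbols | open symbol | matching closing symbol
```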
Example 6: Using the Remove() method to remove characters in specific locations from a string
- Consider the following code:
string data = "12345John Smith          5000  3  ";
string updatedData = data.Remove(5, 20);
Console.WriteLine(updatedData);
- Output:
123455000  3
- The Remove() method works similarly to the Substring() method.
- You supply a starting position and a length, and those characters are removed from the string.
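Two details are worth seeing in code: Remove() returns a new string and leaves the original untouched, and there is a single-argument overload that removes everything from the given index to the end. The sketch below assumes a fixed-width record in which the name field is padded to 20 characters. It builds that record with PadRight() so the padding is explicit. The variable names are made up for illustration.

```csharp
using System;

// Assumed layout: 5-character ID, 20-character name field, then the amount.
string data = "12345" + "John Smith".PadRight(20) + "5000";

// Remove 20 characters starting at index 5 (the whole name field).
string withoutName = data.Remove(5, 20);

// The single-argument overload removes everything from index 5 to the end.
string idOnly = data.Remove(5);

Console.WriteLine(withoutName); // 123455000
Console.WriteLine(idOnly);      // 12345
Console.WriteLine(data.Length); // 29 (the original string is unchanged)
```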
Example 7: Using the Replace() method to remove characters no matter where they appear in a string
- The Replace() method is used when you need to replace one or more characters with a different character (or no character).
- The Replace() method is different from the other methods used so far: it replaces every instance of the given characters, not just the first or last instance.
- For example:
string message = "This--is--ex-amp-le--da-ta";
message = message.Replace("--", " ");
message = message.Replace("-", "");
Console.WriteLine(message);
- Output:
This is example data
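Note that the order of the two calls in the example matters. If the single dash were replaced first, the double dashes would disappear with it and the spaces would be lost. The sketch below shows that effect, along with the overload that swaps one char for another. The variable names are made up for illustration.

```csharp
using System;

string original = "This--is--ex-amp-le--da-ta";

// Every "-" is replaced, including the ones that formed "--".
string noDashes = original.Replace("-", "");

// There is also an overload that swaps one char for another.
string underscores = original.Replace('-', '_');

Console.WriteLine(noDashes);    // Thisisexampledata
Console.WriteLine(underscores); // This__is__ex_amp_le__da_ta
Console.WriteLine(original);    // This--is--ex-amp-le--da-ta (unchanged)
```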
Example 8: Extract, replace, and remove data from an input string
In this example, you work with a string that contains a fragment of HTML. You extract data from the HTML fragment, replace some of its content, and remove other parts of its content to achieve the desired output.
Starter code:
const string input = "<div><h2>Widgets ™</h2><span>5000</span></div>";
string quantity = "";
string output = "";
// Your work here
Console.WriteLine(quantity);
Console.WriteLine(output);
- Expected output:
Quantity: 5000
Output: <h2>Widgets ®</h2><span>5000</span>
- Solution:
const string input = "<div><h2>Widgets ™</h2><span>5000</span></div>";
string quantity = "";
string output = "";
// Your work here
// Extract the "quantity"
const string openSpan = "<span>";
const string closeSpan = "</span>";
int quantityStart = input.IndexOf(openSpan) + openSpan.Length; // skip past the <span> tag itself
int quantityEnd = input.IndexOf(closeSpan);
int quantityLength = quantityEnd - quantityStart;
quantity = input.Substring(quantityStart, quantityLength);
quantity = $"Quantity: {quantity}";
// Set output to input, replacing the trademark symbol with the registered trademark symbol
const string tradeSymbol = "™";
const string regSymbol = "®";
output = input.Replace(tradeSymbol, regSymbol);
// Remove the opening <div> tag
const string openDiv = "<div>";
int divStart = output.IndexOf(openDiv);
output = output.Remove(divStart, openDiv.Length);
// Remove the closing </div> tag and add "Output:" to the beginning
const string closeDiv = "</div>";
int divCloseStart = output.IndexOf(closeDiv);
output = "Output: " + output.Remove(divCloseStart, closeDiv.Length);
Console.WriteLine(quantity);
Console.WriteLine(output);
This post is licensed under CC BY 4.0 by the author.