C# - Interesting String Methods

Introduction

In this post, we will look at some interesting string methods in C#. For example:

  • You’ll use the IndexOf() method to locate the position of a character or string inside a larger string.
  • You’ll use the Substring() method to return the part of the larger string that starts at the character position you specify.
  • You’ll also use an overloaded version of the Substring() method to set the number of characters to return, starting from a specified position in a string.

Examples:

Example 1: Using the IndexOf() Method to find parenthesis pairs embedded in a string

string message = "Find what is (inside the parentheses)";

int openingPosition = message.IndexOf('(');
int closingPosition = message.IndexOf(')');

Console.WriteLine(openingPosition);
Console.WriteLine(closingPosition);
  • Output:
13
36
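The positions printed above are zero-based indexes. As a minimal sketch of how to read them, the following reuses the same message and also shows the string overload of IndexOf() and the value returned when nothing is found. The variable names atOpening, wordPosition, and missingPosition are illustrative only and do not come from the original example.

```csharp
using System;

string message = "Find what is (inside the parentheses)";

// Positions are zero-based, so index 13 is the 14th character.
char atOpening = message[13];
Console.WriteLine(atOpening);        // (

// IndexOf() also accepts a string and returns the index where it starts.
int wordPosition = message.IndexOf("inside");
Console.WriteLine(wordPosition);     // 14

// A character that does not occur yields -1 rather than an exception.
int missingPosition = message.IndexOf('[');
Console.WriteLine(missingPosition);  // -1
```

Checking for -1 before passing a position to Substring() is what keeps the later examples from failing on input that lacks the symbol.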

Example 2: Using the Substring() Method to retrieve the value between parentheses

string message = "Find what is (inside the parentheses)";

int openingPosition = message.IndexOf('(');
int closingPosition = message.IndexOf(')');

openingPosition += 1;

int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
  • Output:
inside the parentheses

  • The Substring() method needs the starting position and the number of characters, or length, to retrieve.
  • So, you calculate the length in a temporary variable called length, and pass it with the openingPosition value to retrieve the string inside the parentheses.
  • To leave the opening parenthesis out of the output, you skip its index by adding 1 to the openingPosition value.
  • You add 1 because that is the length of the single character you searched for. If you need to start after a longer string, for example, <div> or ---, you would add the length of that string instead.
string message = "What is the value <span>between the tags</span>?";

int openingPosition = message.IndexOf("<span>");
int closingPosition = message.IndexOf("</span>");

openingPosition += 6;
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));

Avoid magic values

  • Hardcoded strings like "<span>" in the previous code listing are known as "magic strings", and hardcoded numeric values like 6 are known as "magic numbers".
  • These "magic" values are undesirable for many reasons, and you should avoid them where possible.
  • Review the previous code to consider how the code might break if you hardcoded the string "<span>" multiple times in your code, but misspelled one instance of it as "<sapn>".
  • The compiler doesn’t catch "<sapn>" at compile time because the value is in a string.
  • The misspelling leads to problems at run time, and depending on the complexity of your code, it might be difficult to track down.
  • Furthermore, if you change the string "<span>" to the shorter "<div>", but forget to change the number 6 to 5, then your code produces undesirable results.
string message = "What is the value <span>between the tags</span>?";

const string openSpan = "<span>";
const string closeSpan = "</span>";

int openingPosition = message.IndexOf(openSpan);
int closingPosition = message.IndexOf(closeSpan);

openingPosition += openSpan.Length;
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
  • Take a minute to examine the updated code, in particular the const keyword in const string openSpan = "<span>";.
  • A constant lets you define and initialize a value that can never be changed after it is declared.
  • You then use that constant in the rest of the code wherever you need that value.
  • This ensures that the value is defined only once, and the compiler catches any misspelling of the constant’s name.
  • Now, if the value of openSpan changes to <div>, the line of code that uses the Length property continues to be valid.
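To see that the Length-based offset really does follow the tag, here is a minimal sketch that wraps the steps above in a helper and runs it with two different tag pairs. ExtractBetween, openDiv, and closeDiv are hypothetical names introduced for this illustration; they are not part of the original listing. The helper returns an empty string when either tag is missing, which is an assumption about the desired behavior.

```csharp
using System;

const string openSpan = "<span>";
const string closeSpan = "</span>";
const string openDiv = "<div>";
const string closeDiv = "</div>";

// Same logic, two tag pairs of different lengths, no hardcoded 6 or 5.
Console.WriteLine(ExtractBetween("What is the value <span>between the tags</span>?", openSpan, closeSpan));
Console.WriteLine(ExtractBetween("What is the value <div>between the tags</div>?", openDiv, closeDiv));

// Hypothetical helper: returns the text between openTag and the first
// closeTag that follows it, or "" when either tag cannot be found.
static string ExtractBetween(string text, string openTag, string closeTag)
{
    int start = text.IndexOf(openTag);
    if (start == -1) return "";

    // Skip past the opening tag using its own length.
    start += openTag.Length;

    int end = text.IndexOf(closeTag, start);
    if (end == -1) return "";

    return text.Substring(start, end - start);
}
```

Both calls print between the tags, because the offset is derived from whichever constant is passed in.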

Example 3: Using the LastIndexOf() Method to retrieve the last occurrence of a substring

  • You increase the complexity of the message variable by adding many sets of parentheses, then write code to retrieve the content inside the last set of parentheses.
string message = "(What if) I am (only interested) in the last (set of parentheses)?";
int openingPosition = message.LastIndexOf('(');

openingPosition += 1;
int closingPosition = message.LastIndexOf(')');
int length = closingPosition - openingPosition;
Console.WriteLine(message.Substring(openingPosition, length));
  • Output:
set of parentheses
  • The key to this example is the use of LastIndexOf(), which you use to get the positions of the last opening and closing parentheses.
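As a small sketch of the difference, the following prints the positions that IndexOf() and LastIndexOf() report for the same message. The variable names are illustrative and not from the original example.

```csharp
using System;

string message = "(What if) I am (only interested) in the last (set of parentheses)?";

// IndexOf() searches from the start of the string.
int firstOpening = message.IndexOf('(');
int firstClosing = message.IndexOf(')');

// LastIndexOf() searches backward from the end of the string.
int lastOpening = message.LastIndexOf('(');
int lastClosing = message.LastIndexOf(')');

Console.WriteLine($"First pair: {firstOpening} to {firstClosing}");  // First pair: 0 to 8
Console.WriteLine($"Last pair: {lastOpening} to {lastClosing}");     // Last pair: 45 to 64
```

Note that this pairing relies on the last opening parenthesis appearing before the last closing one, which holds for this message but is not guaranteed for arbitrary input.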

Example 4: Using the Substring() Method to retrieve all instances of substrings inside parentheses

  • Add a while statement to iterate through the string until all sets of parentheses are discovered, extracted, and displayed.
string message = "(What if) there are (more than) one (set of parentheses)?";
while (true)
{
    int openingPosition = message.IndexOf('(');
    if (openingPosition == -1) break;

    openingPosition += 1;
    int closingPosition = message.IndexOf(')');
    int length = closingPosition - openingPosition;
    Console.WriteLine(message.Substring(openingPosition, length));

    // Note the overload of the Substring to return only the remaining
    // unprocessed message:
    message = message.Substring(closingPosition + 1);
}
  • Output:
What if
more than
set of parentheses
  • When you use Substring() without specifying a length input parameter, it returns every character from the starting position you specify to the end of the string.
  • For the string being processed, "(What if) there are (more than) one (set of parentheses)?", this lets you remove the first set of parentheses, (What if), from the value of message once it has been displayed.
  • What remains is then processed in the next iteration of the while loop.
  • The IndexOf() method returns -1 if it can’t find the input parameter in the string.
  • You merely check for the value -1 and break out of the loop.
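The loop above checks for -1 only on the opening parenthesis. If the closing parenthesis is missing, closingPosition is -1, the computed length is negative, and Substring() throws an ArgumentOutOfRangeException. The following is one possible sketch that applies the same -1 check to both symbols and collects the results in a list. ExtractAll is a hypothetical helper name, and stopping at the first unmatched parenthesis is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the content of every complete set of
// parentheses, stopping at the first unmatched opening parenthesis.
static List<string> ExtractAll(string message)
{
    var results = new List<string>();

    while (true)
    {
        int openingPosition = message.IndexOf('(');
        if (openingPosition == -1) break;

        openingPosition += 1;

        // Search for the closing symbol only after the opening one.
        int closingPosition = message.IndexOf(')', openingPosition);
        if (closingPosition == -1) break;

        int length = closingPosition - openingPosition;
        results.Add(message.Substring(openingPosition, length));

        // Keep only the unprocessed remainder of the message.
        message = message.Substring(closingPosition + 1);
    }

    return results;
}

// The last set of parentheses is deliberately left unclosed here.
foreach (string item in ExtractAll("(What if) there are (more than) one (set of parentheses"))
{
    Console.WriteLine(item);
}
```

With the unclosed input shown, this prints What if and more than, then stops instead of throwing.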

Example 5: Using the IndexOfAny() Method to work with different types of symbol sets

  • Update the message string, adding different types of symbols like square brackets [] and curly braces {}.
  • To search for multiple symbols simultaneously, use IndexOfAny().
  • You search with IndexOfAny() to return the index of the first symbol from the array openSymbols found in the message string.
string message = "Help (find) the {opening symbols}";
Console.WriteLine($"Searching THIS Message: {message}");
char[] openSymbols = { '[', '{', '(' };
int startPosition = 5;
int openingPosition = message.IndexOfAny(openSymbols);
Console.WriteLine($"Found WITHOUT using startPosition: {message.Substring(openingPosition)}");

openingPosition = message.IndexOfAny(openSymbols, startPosition);
Console.WriteLine($"Found WITH using startPosition {startPosition}:  {message.Substring(openingPosition)}");
  • Output:
Searching THIS Message: Help (find) the {opening symbols}
Found WITHOUT using startPosition: (find) the {opening symbols}
Found WITH using startPosition 5:  (find) the {opening symbols}
  • You used IndexOfAny() without, and then with, the starting position overload.
  • Now that you found an opening symbol, you need to find its matching closing symbol.
string message = "(What if) I have [different symbols] but every {open symbol} needs a [matching closing symbol]?";

// The IndexOfAny() helper method requires a char array of characters.
// You want to look for:

char[] openSymbols = { '[', '{', '(' };

// You'll use a slightly different technique for iterating through
// the characters in the string. This time, use the closing
// position of the previous iteration as the starting index for the
// next open symbol. So, you need to initialize the closingPosition
// variable to zero:

int closingPosition = 0;

while (true)
{
    int openingPosition = message.IndexOfAny(openSymbols, closingPosition);

    if (openingPosition == -1) break;

    string currentSymbol = message.Substring(openingPosition, 1);

    // Now find the matching closing symbol
    char matchingSymbol = ' ';

    switch (currentSymbol)
    {
        case "[":
            matchingSymbol = ']';
            break;
        case "{":
            matchingSymbol = '}';
            break;
        case "(":
            matchingSymbol = ')';
            break;
    }

    // To find the closingPosition, use an overload of the IndexOf method to specify
    // that the search for the matchingSymbol should start at the openingPosition in the string.

    openingPosition += 1;
    closingPosition = message.IndexOf(matchingSymbol, openingPosition);

    // Finally, use the techniques you've already learned to display the sub-string:

    int length = closingPosition - openingPosition;
    Console.WriteLine(message.Substring(openingPosition, length));
}
  • Output:
What if
different symbols
open symbol
matching closing symbol
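As an alternative to the switch statement, the open and close symbols could be kept together in a lookup table. This is a sketch of that variation, not the original approach: the closingFor dictionary and the found list are names introduced here, and the extra -1 check on the closing symbol is an added assumption about how unmatched symbols should be handled.

```csharp
using System;
using System.Collections.Generic;

string message = "(What if) I have [different symbols] but every {open symbol} needs a [matching closing symbol]?";

// Each opening symbol maps to its closing symbol in one place.
var closingFor = new Dictionary<char, char>
{
    ['['] = ']',
    ['{'] = '}',
    ['('] = ')'
};

// Build the array IndexOfAny() needs from the dictionary keys.
char[] openSymbols = new char[closingFor.Count];
closingFor.Keys.CopyTo(openSymbols, 0);

var found = new List<string>();
int closingPosition = 0;

while (true)
{
    int openingPosition = message.IndexOfAny(openSymbols, closingPosition);
    if (openingPosition == -1) break;

    // Look up the closing symbol for whichever opening symbol was found.
    char matchingSymbol = closingFor[message[openingPosition]];

    openingPosition += 1;
    closingPosition = message.IndexOf(matchingSymbol, openingPosition);
    if (closingPosition == -1) break;

    found.Add(message.Substring(openingPosition, closingPosition - openingPosition));
}

foreach (string item in found)
{
    Console.WriteLine(item);
}
```

This prints the same four lines as the switch version. Supporting another pair, such as angle brackets, would then mean adding a single dictionary entry.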

Example 6: Using the Remove() Method to remove characters in specific locations from a string

  • Consider the following code:
string data = "12345John Smith          5000  3  ";
string updatedData = data.Remove(5, 20);
Console.WriteLine(updatedData);
  • Output:
123455000  3
  • The Remove() method works similarly to the Substring() method.
  • You supply a starting position and the length to remove those characters from the string.
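To make the comparison with Substring() concrete, here is a short sketch using the same fixed-width record. The field layout (a 5-character ID followed by a 20-character name) is inferred from the arguments in the example above, and the variable names are illustrative. The square brackets in the output are only there to make trailing spaces visible.

```csharp
using System;

string data = "12345John Smith          5000  3  ";

// Substring(5, 20) returns exactly the characters that Remove(5, 20) deletes.
string name = data.Substring(5, 20).Trim();
Console.WriteLine($"[{name}]");         // [John Smith]

// Remove(5, 20) returns everything except those 20 characters.
string withoutName = data.Remove(5, 20);
Console.WriteLine($"[{withoutName}]");  // [123455000  3  ]

// Remove() with only a start index deletes through the end of the string.
string idOnly = data.Remove(5);
Console.WriteLine($"[{idOnly}]");       // [12345]
```

Like the other string methods in this post, Remove() returns a new string and leaves data unchanged.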

Example 7: Using the Replace() method to remove characters no matter where they appear in a string

  • The Replace() method is used when you need to replace one or more characters with a different character (or no character).
  • The Replace() method differs from the other methods used so far: it replaces every instance of the given characters, not just the first or last instance.
  • For example:
string message = "This--is--ex-amp-le--da-ta";
message = message.Replace("--", " ");
message = message.Replace("-", "");
Console.WriteLine(message);
  • Output:
This is example data
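Two details of the example above are easy to miss: the result of Replace() has to be assigned, because the original string is not modified, and the order of the two calls matters. The following sketch illustrates both, along with the overload that takes two char values. The variable names are introduced here for illustration.

```csharp
using System;

string message = "This--is--ex-amp-le--da-ta";

// Replace() returns a new string; message itself is left untouched.
string spaced = message.Replace("--", " ");
Console.WriteLine(message);      // This--is--ex-amp-le--da-ta
Console.WriteLine(spaced);       // This is ex-amp-le da-ta

// Removing single dashes afterward gives the cleaned-up text.
string cleaned = spaced.Replace("-", "");
Console.WriteLine(cleaned);      // This is example data

// Reversing the order removes every dash first, so no "--" is left
// to turn into a space.
string wrongOrder = message.Replace("-", "").Replace("--", " ");
Console.WriteLine(wrongOrder);   // Thisisexampledata

// There is also an overload that swaps one char for another.
string underscored = cleaned.Replace(' ', '_');
Console.WriteLine(underscored);  // This_is_example_data
```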

Example 8: Extract, replace, and remove data from an input string

  • In this example, you work with a string that contains a fragment of HTML. You extract data from the HTML fragment, replace some of its content, and remove other parts of its content to achieve the desired output.

  • Starter code:

const string input = "<div><h2>Widgets &trade;</h2><span>5000</span></div>";

string quantity = "";
string output = "";

// Your work here

Console.WriteLine(quantity);
Console.WriteLine(output);
  • Expected output:
Quantity: 5000
Output: <h2>Widgets &reg;</h2><span>5000</span>
  • Solution:
const string input = "<div><h2>Widgets &trade;</h2><span>5000</span></div>";

string quantity = "";
string output = "";

// Your work here

// Extract the "quantity"
const string openSpan = "<span>";
const string closeSpan = "</span>";

int quantityStart = input.IndexOf(openSpan) + openSpan.Length; // Add the tag length so the index points just past <span>
int quantityEnd = input.IndexOf(closeSpan);
int quantityLength = quantityEnd - quantityStart;
quantity = input.Substring(quantityStart, quantityLength);
quantity = $"Quantity: {quantity}";

// Set output to input, replacing the trademark symbol with the registered trademark symbol
const string tradeSymbol = "&trade;";
const string regSymbol = "&reg;";
output = input.Replace(tradeSymbol, regSymbol);

// Remove the opening <div> tag
const string openDiv = "<div>";
int divStart = output.IndexOf(openDiv);
output = output.Remove(divStart, openDiv.Length);

// Remove the closing </div> tag and add "Output:" to the beginning
const string closeDiv = "</div>";
int divCloseStart = output.IndexOf(closeDiv);
output = "Output: " + output.Remove(divCloseStart, closeDiv.Length);

Console.WriteLine(quantity);
Console.WriteLine(output);
This post is licensed under CC BY 4.0 by the author.