James Michael Hare

Welcome to my blog! I'm a Sr. Software Development Engineer in Seattle, WA. I've been doing C++/C#/Java development for over 18 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past Little Wonders posts can be found here.

Two weeks ago I decided to stop my Little Wonders in the String class, but I recanted and decided to do one more before wrapping up String.  So today we’ll look at ways to find out if a given source String has a target String inside of it (and where).

IndexOf() / LastIndexOf() – finds position of a contained string in a string

Let’s start with the most general of the search methods, which tends to be the main one most people are aware of: IndexOf().

As you probably surmised (or knew), IndexOf() returns the position in the instance String of a given String, and returns -1 if it could not be found.  By default, this method searches from the beginning, though you can also pass in a starting position to begin your search.

For example, the search loop below will find all occurrences of “string” (how original, right?) in text from front to back, yielding positions 21 and then 41:

    const string text = "This is a very large string to find a substring contained within.";

    // if no start is defined, zero is assumed.
    int pos = text.IndexOf("string");

    while (pos != -1)
    {
        // the default search is case-sensitive; finds "string" at position 21, then 41
        Console.WriteLine("Found \"string\" at position " + pos);

        // if start is defined, search starts at that char, so want to go at least one past last found
        pos = text.IndexOf("string", pos + 1);
    }
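
If you find yourself writing that loop more than once, it can be wrapped in a small helper. Here is a minimal sketch; the class and method names (SearchSketch, AllIndexesOf) are hypothetical and not part of the framework, and I'm assuming an ordinal comparison so the results don't depend on the current culture:

```csharp
using System;
using System.Collections.Generic;

public static class SearchSketch
{
    // Hypothetical helper (not part of the BCL): collects the start position
    // of every occurrence of target in text, front to back.
    public static List<int> AllIndexesOf(string text, string target)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        // An empty target "matches" at every position, so reject it up front.
        if (string.IsNullOrEmpty(target))
        {
            throw new ArgumentException("Target must be non-empty.", nameof(target));
        }

        var positions = new List<int>();

        // Assumption: ordinal comparison, so results are culture-independent.
        int pos = text.IndexOf(target, StringComparison.Ordinal);

        while (pos != -1)
        {
            positions.Add(pos);

            // pos + 1 may equal text.Length, which IndexOf accepts as a start index.
            pos = text.IndexOf(target, pos + 1, StringComparison.Ordinal);
        }

        return positions;
    }
}
```

For the sample sentence above, SearchSketch.AllIndexesOf(text, "string") should return the positions 21 and 41.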

IndexOf() has an opposite, LastIndexOf(), which performs the same search in reverse.  That is, it returns the position in a String instance of the start of the last occurrence of a given String.  This method, by default, searches from the end of the String, though like its forward counterpart you can also pass in a starting position to begin your reverse search.

For example, the loop below will find all occurrences of “string” in text, starting from the back of the string and moving forward, yielding finds at position 41 and then 21:

    // an example string to search...
    const string text = "This is a very large string to find a substring contained within.";

    // if no start is defined, last char position is assumed.
    int pos = text.LastIndexOf("string");

    while (pos != -1)
    {
        // the default search is case-sensitive; finds "string" at position 41, then 21
        Console.WriteLine("Found \"string\" at position " + pos);

        // if start is defined, search starts at that char, so want to go at least one before last found
        // (a match at position 0 ends the loop, since a start of -1 would throw)
        pos = pos > 0 ? text.LastIndexOf("string", pos - 1) : -1;
    }

By default, IndexOf()/LastIndexOf() are case-sensitive, but both support overloads that let you search the String instance in a case-insensitive manner by providing a StringComparison:

    const string text = "This is a very large string to find a substring contained within.";

    int pos = text.IndexOf("sTrInG", StringComparison.CurrentCultureIgnoreCase);

    if (pos != -1)
    {
        // will look for "string" regardless of the case of any char and find it at position 21
        Console.WriteLine("Found \"sTrInG\" case-insensitively at position " + pos);
    }

These two method families are very extensive: they also have overloads for searching only a given number of characters from a starting position, for searching for a char in a String, and (via IndexOfAny()/LastIndexOfAny()) for searching for any one of a set of chars in a String.
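
To make those extra overloads a bit more concrete, here is a small sketch of three of them on the forward-searching side (the LastIndexOf() family has mirror-image versions). The class and member names (OverloadSketch and friends) are made up for illustration, and the ordinal comparison in the last method is my own assumption:

```csharp
using System;

public static class OverloadSketch
{
    public const string Text = "This is a very large string to find a substring contained within.";

    // Searching for a single char instead of a string.
    public static int FirstV() => Text.IndexOf('v');

    // Searching for the first occurrence of any char in a set.
    public static int FirstXyz() => Text.IndexOfAny(new[] { 'x', 'y', 'z' });

    // Searching only the first 'count' characters: start at index 0 and
    // examine 'count' character positions. Ordinal is an assumption here.
    public static int StringWithinFirst(int count) =>
        Text.IndexOf("string", 0, count, StringComparison.Ordinal);
}
```

With the sample sentence, FirstV() should be 10 and FirstXyz() should be 13 (the 'y' in "very"). StringWithinFirst(20) should come back as -1 because "string" does not fit inside the first 20 characters, while StringWithinFirst(27) should find it at 21.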

StartsWith() – determines if a string has a given prefix

While IndexOf() is a very general case, StartsWith() is much more specific.  It looks to see if an instance starts with the given String.  Obviously, all we need to know is true/false because we know the index of the found string would be zero.

So this is a handy, more readable short-cut if your only goal is to see if a string starts with a given prefix, either case-sensitive or case-insensitive:

    const string text = "This is a very large string to find a substring contained within.";

    // case-sensitive
    if (text.StartsWith("This"))
    {
        Console.WriteLine("The text starts with \"This\"");
    }

    // case-insensitive
    if (text.StartsWith("this", StringComparison.CurrentCultureIgnoreCase))
    {
        Console.WriteLine("The text starts with case-insensitive \"this\"");
    }

This is logically equivalent to performing an IndexOf() and checking to see if it was found at position zero:

    const string text = "This is a very large string to find a substring contained within.";

    // case-sensitive
    if (text.IndexOf("This") == 0)
    {
        Console.WriteLine("The text starts with \"This\"");
    }

    // case-insensitive
    if (text.IndexOf("this", StringComparison.CurrentCultureIgnoreCase) == 0)
    {
        Console.WriteLine("The text starts with case-insensitive \"this\"");
    }

So you can use the StartsWith() method when you want to see whether an instance has a given prefix, and it will be somewhat more readable than IndexOf().

EndsWith() – determines if a string has a given suffix

Similar to its opposite, StartsWith(), the EndsWith() method checks to see if an instance ends with the specified String:

    const string text = "This is a very large string to find a substring contained within.";

    // case-sensitive
    if (text.EndsWith("in."))
    {
        Console.WriteLine("The text ends with \"in.\"");
    }

    // case-insensitive
    if (text.EndsWith("IN.", StringComparison.CurrentCultureIgnoreCase))
    {
        Console.WriteLine("The text ends with case-insensitive \"IN.\"");
    }

Once again, this would be equivalent to performing a LastIndexOf() and checking for the position to be the length of the instance minus the length of the String to find:

    const string text = "This is a very large string to find a substring contained within.";

    // case-sensitive
    if (text.LastIndexOf("in.") == (text.Length - "in.".Length))
    {
        Console.WriteLine("The text ends with \"in.\"");
    }

    // case-insensitive
    if (text.LastIndexOf("IN.", StringComparison.CurrentCultureIgnoreCase) == 
        (text.Length - "IN.".Length))
    {
        Console.WriteLine("The text ends with case-insensitive \"IN.\"");
    }

But once again, the EndsWith() method is (especially in this case) much easier to read and understand.
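
Readability isn't the only reason to prefer EndsWith() here. As a rough sketch (the SuffixSketch class and its method name are hypothetical, and I'm assuming an ordinal comparison), the hand-rolled LastIndexOf() check can be generalized like this, and it has at least one edge case that EndsWith() handles for you:

```csharp
using System;

public static class SuffixSketch
{
    // The hand-rolled suffix check from the text, generalized to any suffix.
    // Assumption: ordinal comparison, to keep culture out of the picture.
    public static bool EndsWithViaLastIndexOf(string text, string suffix) =>
        text.LastIndexOf(suffix, StringComparison.Ordinal) == text.Length - suffix.Length;
}
```

For the sample sentence and "in." this agrees with EndsWith(). But when the suffix is exactly one character longer than the text, say "ab" and "abc", LastIndexOf() returns -1 and the length difference is also -1, so the hand-rolled check reports true, whereas "ab".EndsWith("abc", StringComparison.Ordinal) correctly reports false.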

Contains() – determines if a string contains another string

This sounds a lot like IndexOf(), right?  Well, not quite.  IndexOf() finds the position of a contained String, whereas all Contains() is concerned with is whether the instance contains the given String at all:

    const string text = "This is a very large string to find a substring contained within.";

    if (text.Contains("string"))
    {
        Console.WriteLine("The text contains the word \"string\".");
    }

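So, obviously, this is logically equivalent to calling IndexOf() and checking that the returned position is not -1 (with one caveat: Contains() always performs an ordinal comparison, while IndexOf() given just a String uses the current culture):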

    const string text = "This is a very large string to find a substring contained within.";

    if (text.IndexOf("string") != -1)
    {
        Console.WriteLine("The text contains the word \"string\".");
    }

So, now for the case-insensitive form of Contains()… um…  well, in the .NET Framework that doesn’t currently exist (newer versions of .NET have since added a Contains() overload that takes a StringComparison).  There is a LINQ Contains() extension method that takes a comparer, but when applied to a String, it checks to see if the String contains a char, not another String.

That said, we can easily write an extension method to “add” a case-insensitive Contains() to our String class:

    // adding a Contains() extension method to string that takes a StringComparison,
    // so callers can check case-insensitively.
    // extension methods must live in a static class (the class name here is illustrative)
    public static class StringExtensions
    {
        public static bool Contains(this string source, string target, StringComparison comparer)
        {
            // good extension methods should not accept null (unless the name implies it can)
            if (source == null)
            {
                throw new ArgumentNullException("source");
            }

            return source.IndexOf(target, comparer) != -1;
        }
    }

This lets us call Contains() either in a case-sensitive manner (the instance method of String) or a case-insensitive manner (our extension method of String):

    const string text = "This is a very large string to find a substring contained within.";

    if (text.Contains("StRiNg", StringComparison.CurrentCultureIgnoreCase))
    {
        Console.WriteLine("The text contains (case-insensitive) the word \"StRiNg\".");
    }
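
If case-insensitive checks are all you ever need, a variation on the same idea is a convenience method that bakes the comparison in. This is only a sketch: the names StringSearchExtensions and ContainsIgnoreCase are hypothetical, and the choice of OrdinalIgnoreCase (rather than the culture-aware comparison used above) is my assumption, so swap in whichever comparison suits your data:

```csharp
using System;

public static class StringSearchExtensions
{
    // Hypothetical convenience wrapper: a case-insensitive "contains" check.
    // Assumption: OrdinalIgnoreCase, which ignores case without applying
    // culture-specific rules.
    public static bool ContainsIgnoreCase(this string source, string target)
    {
        // as before, reject null rather than silently accepting it
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (target == null) throw new ArgumentNullException(nameof(target));

        return source.IndexOf(target, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With that in place, text.ContainsIgnoreCase("StRiNg") should be true for the sample sentence, and giving the method its own distinct name avoids any confusion with a framework-provided Contains() overload.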

Summary

So this is the final “Little Wonder” in the String class that we will examine in this series.  And, in addition, we’ve given you a “C# Toolbox” item as well (a case-insensitive Contains() extension method)!

Next I think I’ll take a look at some of the “Little Wonders” of the Task Parallel Library (TPL) in .NET 4.0.  I’ve already talked about the concurrent collections, but now would like to focus on the Task and Parallel classes, among others.

 


Posted on Thursday, October 13, 2011 6:23 PM | Filed Under [ My Blog C# Software .NET Little Wonders Toolbox ]

Feedback

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

What a great way for doing a case insensitive contains without having to revert to regex. Nice.
10/14/2011 10:14 AM | jorisdries

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

@jorisdries: Thanks! RegEx does have a lot of power, of course, and compiled RegEx can be quite fast, but Contains() is very simple and easy to maintain. One of the many trade-offs of power vs maintainability :-)
10/14/2011 10:35 AM | James Michael Hare

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

I usually used RegEx until a friend of mine told me about Contains(), which in my opinion is more versatile and simple. After all, everyone likes simple things because they are more stable and faster. You cannot imagine how long it took me to search for string prefixes; with StartsWith() it's much easier and faster.
5/3/2012 10:29 AM | activex

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

@activex: each is useful in its own way, the main thing is to use each tool for its intended purpose.

Regex is great for pattern matching so that's where I tend to use it, but if I just want a simple StartsWith, EndsWith, or Contains, then I really don't need pattern matching, and the string methods are easier to read and generally more performant.

On the other hand, if I wanted to match an IP address, SSN, or any formatted data, I'd probably prefer a RegEx.
5/3/2012 1:05 PM | James Michael Hare

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

Here's a blog which benchmarks several different C# techniques. Makes for a great read, especially for those who are looking for the *fastest* way:

http://blogs.davelozinski.com/curiousconsultant/csharp-net-fastest-way-to-check-if-a-string-occurs-within-a-string
8/26/2013 7:37 PM | FireMyst

# re: C#/.NET Little Wonders: Searching Strings With Contains(), StartsWith(), EndsWith(), and IndexOf()

@FireMyst: the *fastest* way can often be misleading. Something that's faster now may change with each new version of the .NET Framework as enhancements and optimizations are made.

Hence I usually suggest you should pick the *correct* way first and *only* perform optimization when you can demonstrate that the area of code in question is a significant bottleneck to the overall performance of the system.

Remember: it's easier to make a correct program fast than to make a fast program correct.

-James
8/26/2013 7:50 PM | James Michael Hare