@jskorpan
Forked from cgbystrom/StringTokenizer.java
Created June 30, 2011 11:37
Ultra fast Java string tokenizer
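public class StringTokenizer {
    // Per-thread scratch array, reused across calls so each call allocates only the result.
    private static ThreadLocal<String[]> tempArray = new ThreadLocal<String[]>();

    public static String[] tokenize(String string, char delimiter)
    {
        String[] temp = tempArray.get();
        // Worst case: every character is a delimiter, which yields length + 1 tokens.
        // (length / 2 + 2 overflows the array on inputs such as ",,,".)
        int tempLength = string.length() + 1;
        if (temp == null || temp.length < tempLength)
        {
            temp = new String[tempLength];
            tempArray.set(temp);
        }

        int wordCount = 0;
        int i = 0;                          // start of the current token
        int j = string.indexOf(delimiter);  // position of the next delimiter
        while (j >= 0)
        {
            temp[wordCount++] = string.substring(i, j);
            i = j + 1;
            j = string.indexOf(delimiter, i);
        }
        temp[wordCount++] = string.substring(i);  // tail after the last delimiter

        // Copy only the used slots into a right-sized result.
        String[] result = new String[wordCount];
        System.arraycopy(temp, 0, result, 0, wordCount);
        return result;
    }
}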
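
The scan above keeps empty tokens, including a trailing one, which differs from `String.split`. The sketch below illustrates that behavior with a simplified stand-in: `TokenizeDemo` and `split` are hypothetical names, and the scratch array is replaced by an `ArrayList` so the example stays self-contained. It is not the gist's implementation, only the same `indexOf`/`substring` loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the tokenizer above: same indexOf/substring scan,
// but collecting into a List instead of a ThreadLocal scratch array.
final class TokenizeDemo {
    static String[] split(String string, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int end = string.indexOf(delimiter);
        while (end >= 0) {
            // Adjacent delimiters produce an empty token here.
            tokens.add(string.substring(start, end));
            start = end + 1;
            end = string.indexOf(delimiter, start);
        }
        // Tail after the last delimiter (empty if the input ends with one).
        tokens.add(string.substring(start));
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a,b,,c", ','))); // prints [a, b, , c]
        System.out.println(split("a,b,", ',').length);             // prints 3
        System.out.println("a,b,".split(",").length);              // prints 2
    }
}
```

The last two lines show the difference: the scan returns a trailing empty token for `"a,b,"`, while `String.split` with its default limit drops trailing empty strings.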
Author

jskorpan commented Jul 1, 2011

Fixed the ArrayIndexOutOfBounds

yossale commented May 24, 2012

Why did you decide to use a ThreadLocal class variable for the tempArray instead of declaring it inside the function?

Author

jskorpan commented May 24, 2012 via email

hrzafer commented Dec 30, 2013

Can you provide a version that supports multiple delim chars?
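
The gist itself contains no multi-delimiter version. A minimal sketch of one possible approach, assuming the delimiters are passed as a `String` and every character in it counts as a separator: `MultiDelimTokenizer` and its `tokenize` signature are hypothetical, and the `ThreadLocal` scratch array is left out for brevity, so this makes no claim about speed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical multi-delimiter variant; not part of the original gist.
final class MultiDelimTokenizer {
    static String[] tokenize(String string, String delimiters) {
        List<String> tokens = new ArrayList<>();
        int start = 0; // start of the current token
        for (int i = 0; i < string.length(); i++) {
            // A character is a separator if it occurs anywhere in `delimiters`.
            if (delimiters.indexOf(string.charAt(i)) >= 0) {
                tokens.add(string.substring(start, i));
                start = i + 1;
            }
        }
        // Tail after the last separator (empty if the input ends with one).
        tokens.add(string.substring(start));
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] parts = tokenize("a,b;c d", ",; ");
        System.out.println(parts.length); // prints 4
    }
}
```

As with the single-character version, adjacent separators yield empty tokens: `tokenize("a,;b", ",;")` returns `"a"`, `""`, `"b"`.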

asgs commented Jul 11, 2017

why/how is this ultra fast?
