The brute force method:
Run two loops with variables i and j. Compare str[i] and str[j]. If they become equal at any point, return false.
For characters which are represented by one byte in UTF-8.
func hasUniqueCharacters1(str string) (string, bool) {
for i := 0; i < len(str); i++ {
for j := i + 1; j < len(str); j++ {
if str[i] == str[j] {
return str, false
}
}
}
return str, true
}
To support Unicode, you must convert the string to []rune
func hasUniqueCharacters2(str string) (string, bool) {
chars := []rune(str)
length := len(chars)
for i := 0; i < length; i++ {
for j := i + 1; j < length; j++ {
if chars[j] == chars[i] {
return str, false
}
}
}
return str, true
}
Sorting values
To avoid having to implement methods of the sort.Interface, you can use the ready-made sortutil package.
func hasUniqueCharacters3(str string) (string, bool) {
chars := sortutil.RuneSlice(str)
length := len(chars)
sort.Sort(chars)
for i := 0; i < length-1; i++ {
if chars[i] == chars[i+1] {
return str, false
}
continue
}
return str, true
}
A variant of sorting without implementing interface methods.
func hasUniqueCharacters4(str string) (string, bool) {
chars := []rune(str)
length := len(chars)
sort.Slice(chars, func(i int, j int) bool {
return chars[i] < chars[j]
})
for i := 0; i < length-1; i++ {
if chars[i] == chars[i+1] {
return str, false
}
continue
}
return str, true
}
Using an additional data structure
Using big.Int, to store a set of bits:
func hasUniqueCharacters5(str string) (string, bool) {
var bits big.Int
for _, char := range str {
val := int(char)
if bits.Bit(val) != 0 {
return str, false
}
bits.SetBit(&bits, val, 1)
}
return str, true
}
Using map with boolean values
func hasUniqueCharacters6(str string) (string, bool) {
var bits = make(map[int]bool, len(str))
for _, char := range str {
val := int(char)
if bits[val] {
return str, false
}
bits[val] = true
}
return str, true
}
If you need only ASCII values (8 bits).
We need a slice with logical values of a given length – 256. By default all elements of the slice will be equal to false.
const MAX_CHAR = 256
func hasUniqueCharacters7(str string) (string, bool) {
// If the size is larger than 256,
// then any characters must have been repeated
if len(str) > MAX_CHAR {
return str, false
}
chars := make([]bool, MAX_CHAR)
for _, char := range str {
val := int(char)
if chars[val] {
return str, false
}
chars[val] = true
}
return str, true
}
Using bitwise operators
Suitable for characters in the range a-z in the ASCII table.
Instead of using arrays, let’s take an int value.
Each bit in an int value can be treated as a flag, so the int ends up being an array of bits (a flag).
int has a fixed size, usually 4 bytes, which means 8 * 4 = 32 bits (flags).
When we iterate over the string, we find the int value of the character with respect to ‘a’.
Then a bit in that int value is set to 1 by a bitwise shift to the left 1 << val .
Then we check if a bit is already set in this value, using the AND & operator. Return false.
The bit mask. 10111011 & 00001000 = 00001000
If a bit is not set in the given place, we update the checker value, using operator OR |
Merging.
00000001 | 00001000 = 00001001
func hasUniqueCharacters8(str string) (string, bool) {
checker := 0
for _, char := range str {
index := char - 'a'
val := 1 << index
if checker & val > 0 {
return str, false
}
checker |= val
}
return str, true
}
Tests
In tests, the string abcdefghijklmnopqrstuvwxyz is passed to the function arguments.
go test -bench BenchmarkCountLetters -benchtime=10000000x -benchmem
goos: windows
goarch: amd64
pkg: benching
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkCountLetters/Test_1-8 10000000 174.0 ns/op 0 B/op 0 allocs/op
BenchmarkCountLetters/Test_2-8 10000000 310.7 ns/op 0 B/op 0 allocs/op
BenchmarkCountLetters/Test_3-8 10000000 619.9 ns/op 136 B/op 2 allocs/op
BenchmarkCountLetters/Test_4-8 10000000 756.2 ns/op 168 B/op 3 allocs/op
BenchmarkCountLetters/Test_5-8 10000000 400.2 ns/op 48 B/op 1 allocs/op
BenchmarkCountLetters/Test_6-8 10000000 1653 ns/op 459 B/op 3 allocs/op
BenchmarkCountLetters/Test_7-8 10000000 37.31 ns/op 0 B/op 0 allocs/op
BenchmarkCountLetters/Test_8-8 10000000 55.49 ns/op 0 B/op 0 allocs/op
The code and performance tests can be found on the github.