How to Read/Iterate Each Character in a String in Go

In this post, we’ll cover the different ways to read the characters of a Go string and explain when to use each approach.


Strings in Go: A Quick Primer

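In Go, a string is an immutable sequence of bytes, and string literals in Go source code are UTF-8 encoded. Reading a string byte by byte works for ASCII text, where every character occupies exactly one byte, but characters outside ASCII occupy several bytes and need special handling. Go represents a single Unicode code point with the rune type, which is an alias for int32.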

For example:

s := "Hello, 世界"
fmt.Println(len(s)) // Outputs 13

The string has 13 bytes because the seven ASCII characters in “Hello, ” take one byte each, while “世” and “界” take three bytes each in UTF-8 (7 + 3 + 3 = 13). To process such strings correctly, we must work with runes rather than bytes.
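To make the difference between bytes and runes concrete, here is a minimal sketch that reports both counts for the same string. The helper name byteAndRuneCount is hypothetical and exists only for this illustration; utf8.RuneCountInString comes from the standard unicode/utf8 package.

```go
package main

import (
    "fmt"
    "unicode/utf8"
)

// byteAndRuneCount returns the length of s in bytes and in runes.
// The name is hypothetical, chosen only for this example.
func byteAndRuneCount(s string) (int, int) {
    return len(s), utf8.RuneCountInString(s)
}

func main() {
    bytes, runes := byteAndRuneCount("Hello, 世界")
    fmt.Println(bytes, runes) // prints: 13 9
}
```

The gap between the two numbers (13 versus 9) is exactly the four extra bytes used by the two three-byte characters.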


Methods to Read Each Character in a String

1. Using for range to Read Characters

The simplest and most common way to read each character in a string is a for range loop. The loop decodes the string as UTF-8 and yields one rune per iteration, together with the byte index at which that rune starts.

Example:

package main

import "fmt"

func main() {
    s := "Hello, 世界"
    
    for i, char := range s {
        fmt.Printf("Index: %d, Character: %c\n", i, char)
    }
}

Output:

Index: 0, Character: H
Index: 1, Character: e
Index: 2, Character: l
Index: 3, Character: l
Index: 4, Character: o
Index: 5, Character: ,
Index: 6, Character:  
Index: 7, Character: 世
Index: 10, Character: 界

Why Use It?

  • It correctly handles multibyte characters.
  • The i variable is the byte index at which the character starts, which is why the output jumps from 7 to 10 after the three-byte “世”.
  • The char variable contains the Unicode code point (rune).
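The jump in the byte index can be checked by recording how many bytes each rune occupies. The sketch below is one way to do that, assuming the input is valid UTF-8; runeWidths is a hypothetical helper name, and utf8.RuneLen is the standard library function that reports the encoded size of a rune.

```go
package main

import (
    "fmt"
    "unicode/utf8"
)

// runeWidths returns the UTF-8 byte width of each rune in s, in order.
// Hypothetical helper; it assumes s contains only valid UTF-8.
func runeWidths(s string) []int {
    var widths []int
    for _, r := range s {
        widths = append(widths, utf8.RuneLen(r))
    }
    return widths
}

func main() {
    fmt.Println(runeWidths("Hello, 世界")) // prints: [1 1 1 1 1 1 1 3 3]
}
```

Summing the widths gives 13, which matches len(s) for this string.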

2. Accessing Bytes Directly

If you’re working with ASCII strings or raw binary data, you can read each byte individually.

Example:

package main

import "fmt"

func main() {
    s := "Hello, 世界"
    
    for i := 0; i < len(s); i++ {
        fmt.Printf("Index: %d, Byte: %x\n", i, s[i])
    }
}

Output:

Index: 0, Byte: 48
Index: 1, Byte: 65
Index: 2, Byte: 6c
...
Index: 7, Byte: e4
Index: 8, Byte: b8
Index: 9, Byte: 96
Index: 10, Byte: e7
Index: 11, Byte: 95
Index: 12, Byte: 8c

Why Use It?

  • It’s fast and efficient for binary or ASCII data.
  • Multibyte characters are split into their individual bytes (“世” appears above as e4, b8, 96), so this method is unsuitable for processing non-ASCII text character by character.
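Because byte indexing is only safe for ASCII text, it can help to check that assumption first. The following is a minimal sketch of such a check using the same index-based loop; isASCII is a hypothetical helper, and the 0x80 threshold reflects the fact that every ASCII character is encoded as a single byte below that value.

```go
package main

import "fmt"

// isASCII reports whether every byte in s is below 0x80,
// meaning each character occupies exactly one byte.
// Hypothetical helper written for this example.
func isASCII(s string) bool {
    for i := 0; i < len(s); i++ {
        if s[i] >= 0x80 {
            return false
        }
    }
    return true
}

func main() {
    fmt.Println(isASCII("Hello, "))     // prints: true
    fmt.Println(isASCII("Hello, 世界")) // prints: false
}
```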

3. Converting to a Rune Slice

If you need to access Unicode characters by position (for example, the nth character of a string), you can convert the string to a slice of runes.

Example:

package main

import "fmt"

func main() {
    s := "Hello, 世界"
    
    runes := []rune(s)
    for i, r := range runes {
        fmt.Printf("Index: %d, Rune: %c\n", i, r)
    }
}

Output:

Index: 0, Rune: H
Index: 1, Rune: e
Index: 2, Rune: l
Index: 3, Rune: l
Index: 4, Rune: o
Index: 5, Rune: ,
Index: 6, Rune:  
Index: 7, Rune: 世
Index: 8, Rune: 界

Why Use It?

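  • The []rune conversion decodes the whole string up front, so each slice element is exactly one Unicode code point. Note that the conversion allocates a new slice and copies the data, which costs time and memory on large strings.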
  • The i variable represents the index of the character in the rune slice, not the byte index.
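Random access is what makes the rune slice useful for operations such as reversing a string. The sketch below shows one possible approach; reverseRunes is a hypothetical helper, and it reverses code points, so text that relies on combining marks may not display as expected after reversal.

```go
package main

import "fmt"

// reverseRunes returns s with its Unicode code points in reverse order.
// Hypothetical helper; it swaps runes in place on a copy of the string.
func reverseRunes(s string) string {
    r := []rune(s)
    for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
        r[i], r[j] = r[j], r[i]
    }
    return string(r)
}

func main() {
    fmt.Println(reverseRunes("Hello, 世界")) // prints: 界世 ,olleH
}
```

Doing the same swap on bytes instead of runes would scramble the three-byte sequences and produce invalid UTF-8.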

When to Use Each Approach

Method | Use Case
for range loop | When you want to iterate over characters (runes) properly, including multibyte characters.
Accessing bytes directly | When working with raw bytes, binary data, or ASCII strings.
Converting to rune slice | When you need random access to Unicode characters or their indexes.