Lenspath: editing deeply nested structures

2023 Sep 18 See all posts


Lenspath: editing deeply nested structures

Functional programming offers a variety of constructs like data structures and higher-order functions. Some of them are brought over to the imperitive languages. Streams (delayed lists), and higher-order functions like map, filter, collect are prime examples of such imports.
Once I found myself needing to unmarshal a deeply nested structure in golang – the structure involved arrays and maps, and had to be sent over the wire (serialized). Consider the following example:

{
  "elements": 1,
  "endBlock": 15084335,
  "replicaEvent": [
    {
      "data": {
        "Hash": "0x6cae5c317d2e9b2f83c831227d244cf898802da0f92225699ccc79a26d66b15e",
        "Header": {
          "baseFeePerGas": 48090598751,
          "difficulty": 11905180535994968,
          "extraData": "cG9vbGluLmNvbR+NhcRb3CgzTQ==",
          "gasLimit": 29970705,
          "gasUsed": 1141848,
  "logsBloom": [
   16, 33, 64, 0, 0, 0, 24, 0, 16, 8, 2, 0, 128, 17, 1, 0, 0, 96, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 48, 0, 8, 0, 0, 0, 1, 0, 6, 0, 0, 1, 8, 0, 16, 0, 0, 0, 0, 4, 34, 0, 128, 0, 8, 32, 25, 16, 0, 0, 0, 64, 0, 50, 0, 0, 0, 4, 0, 0, 4, 0, 1, 32, 72, 2, 0, 40, 0, 0, 65, 96, 0, 128, 0, 0, 4, 0, 16, 0, 160, 0, 4, 1, 128, 0, 144, 2, 2, 0, 0, 0, 2, 0, 64, 0, 0, 0, 1, 0, 0, 32, 8, 66, 0, 5, 128, 64, 0, 48, 0, 0, 48, 0, 0, 48, 8, 0, 0, 64, 0, 16, 1, 2, 2, 160, 0, 0, 0, 16, 128, 16, 0, 16, 0, 0, 4, 0, 64, 1, 1, 0, 0, 8, 0, 40, 0, 64, 72, 0, 0, 0, 2, 1, 0, 0, 1, 2, 36, 2, 8, 192, 64, 0, 5, 0, 0, 0, 0, 36, 0, 0, 0, 2, 2, 1, 0, 0, 0, 4, 0, 0, 0, 0, 0, 1, 0, 2, 2, 4, 1, 8, 0, 0, 0, 8, 0, 32, 64, 64, 2, 0, 0, 0, 0, 0, 0, 16, 0, 0, 32, 0, 4, 0, 184, 1, 0, 56, 32, 8, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 0, 1, 2, 0, 0, 0, 0, 96, 0, 64, 0, 64, 0, 0, 0, 4, 48, 2
  ],
  "difficulty": 11905180535994968,
  "number": 15084335,
  "gasLimit": 29970705,
  "gasUsed": 1141848,
  "timestamp": 1657048078,
  "extraData": "cG9vbGluLmNvbR+NhcRb3CgzTQ==",
  "mixHash": "0x5ab6399ba49cdf5f34ba2396f59d1113468e4ba7b328765e74de5c9089292968",
  "nonce": [
   47,
   141,
   202,
   109,
   90,
   237,
   119,
   225
  ],
  "baseFeePerGas": 48090598751
 },
 "Transactions": [
  {
   "nonce": 14520,
   "gasPrice": 97608830985,
   "gas": 30000,
   "from": "0x0000000000000000000000000000000000000000",
   "to": "0x903c26a3d690bf010fd6441328983925852ffe4c",
   "value": 0,
   "input": ""
  },
  {
   "nonce": 178,
   "gasPrice": 60885866905,
   "gas": 727188,
   "from": "0x6b97445c501ec44bcda147d844bee0b13ec46545",
   "to": "0x3b2a8583d381845d278fb75345e39c64891b3611",
   "value": 0,
   "input": "rWrIGwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC"
  },
...

(The complete example is available here)

Before serialization, we needed to modify the structure at specific points (or paths). Initially, we used a brute-force approach of raw traversal with conditional checks, ultimately reaching the point-of-interest path leaf, and making the edit. This was blowing up into ugly code as we iterated on the structure.

// UnwrapAvroUnion "unwraps" the "to" field from the replica map
func UnwrapAvroUnion(data map[string]interface{}) map[string]interface{} {
    vs := data
    for k1 := range data {
        if k1 == "replicaEvent" {
            m1 := data[k1].([]interface{})
            vsr := m1
            for k2 := range m1 {
                m2 := m1[k2].(map[string]interface{})
                vso := m2
                for k3 := range m2 {
                    if k3 == "data" {
                        m3 := m2[k3].(map[string]interface{})
                        vsd := m3
                        for k4 := range m3 {
                            switch k4 {
                            case "Transactions":
                                m4 := m3[k4].([]interface{})
                                vst := m4
                                for k5 := range m4 {
                                    m5 := m4[k5].(map[string]interface{})
                                    vsm := make(map[string]interface{})
                                    for k6, v6 := range m5 {
                                        switch {
                                        case (k6 == "to" || k6 == "from") && v6 != nil:
                                            m6 := v6.(map[string]interface{})
                                            if v7, ok := m6["string"]; ok {
                                                vsm[k6] = v7
                                            }
                                        case (k6 == "v" || k6 == "r" || k6 == "s") && v6 != nil:
                                            m6 := v6.(map[string]interface{})
                                            if v7, ok := m6["bytes"]; ok {
                                                vsm[k6] = v7
                                            }
                                        default:
                                            vsm[k6] = v6
                                        }
                                    }
                                    vst[k5] = vsm
                                }
                                vsd[k4] = vst

                            case "Header":
                                m4 := m3[k4].(map[string]interface{})
                                vst := m4
                                for k5, v5 := range m4 {
                                    if k5 == "withdrawalsRoot" && v5 != nil {
                                        m5 := v5.(map[string]interface{})
                                        if v6, ok := m5["string"]; ok {
                                            vst[k5] = v6
                                        }
                                    }
                                }
                                vsd[k4] = vst

                            case "Withdrawals", "Uncles":
                                m4 := m3[k4].(map[string]interface{})
                                if m3[k4] == nil {
                                    vsd[k4] = nil
                                } else {
                                    vsd[k4] = m4["array"]
                                }
                            }
                        }
                        vso[k3] = vsd
                    }
                }
                vsr[k2] = vso
            }
            vs[k1] = vsr
        }
    }

    return vs
}

This is where lenspath comes in.

What is lenspath

Lenspaths or simply Lenses, captures the description of a path through an arbitrary structure. Once a lenspath is defined, one can edit the value at the end of such a lenspath. Consider this example from the golang lenspath library:

data := map[string]any{
        "name":   "chacha",
        "region": "India",
        "additional": map[string]any{
            "birthmark": "cut on the left hand",
            "addi": map[string]string{
                "code":     "334532",
                "landmark": "near the forest entry",
            },
        },
    }

codePath, _ := Create([]string{"additional", "addi", "code"})
value, _ := codePath.Get(data)
fmt.Println(value) // "334532"

codePath.Set(data, "5A-1001")
value, _ = codePath.Get(data)
fmt.Println(value) // "5A-1001"

Here, the codePath captures the path additional -> addi -> code. Note that defining this lenspath is decoupled from the actual structure that needs to be traversed. Once such a path is available, one can edit or retrieve the value using Get or Set calls. You can also have ways to compose the lenspath. For the example json given at the start of the post, we define the following list of lenspaths:

var dataLens = createLenspath([]string{"replicaEvent", "*", "data"})
var transactionsLens = composeLenspath(dataLens, []string{"Transactions", "*"})
var vLens = composeLenspath(transactionsLens, []string{"v"})
var rLens = composeLenspath(transactionsLens, []string{"r"})
var sLens = composeLenspath(transactionsLens, []string{"s"})
var toLens = composeLenspath(transactionsLens, []string{"to"})
var fromLens = composeLenspath(transactionsLens, []string{"from"})

var headerLens = composeLenspath(dataLens, []string{"Header"})
var withdrawalsRootLens = composeLenspath(headerLens, []string{"withdrawalsRoot"})

var withdrawalsLens = composeLenspath(dataLens, []string{"Withdrawals"})
var uncleLens = composeLenspath(dataLens, []string{"Uncles"})


func composeLenspath(prevLenspath *lensp.Lenspath, lens []lensp.Lens) *lensp.Lenspath {
    lenspath, err := prevLenspath.Compose(lens)
    if err != nil {
        log.Fatal(err)

        return nil
    }

    return lenspath
}

A thing to note is the * "node" when defining the dataLens path.

var dataLens = createLenspath([]string{"replicaEvent", "*", "data"})

This is a fanout operation, which indicates that the previous node (replicaEvent) is an array, and that the path traverses into all elements of replicaEvent (and then into data node).

Now, constructing such a library might be easier in a functional programming language like Haskell. Plently of higher-order operations like map and filter made their way into imperative languages because of their utility. For us, lenses was something that (if possible) could be brought into golang and give major benefits. This would need using golang reflection API.

laws of reflection in golang

I'm not going to deep dive into golang reflection here, as there's already a superb article on this. The golang reflection API provides two important sructures - reflect.Type and reflect.Value. These two structures (and the corresponding APIs they expose), allows recursive traversal along the structure (map or struct), and retrieval of value from an interface type.

the callback API in golang lenspath

The golang lenspath also provides a callback API for get and set:

var dataLens = createLenspath([]string{"replicaEvent", "*", "data"})

index = 0
err := dataLens.Setter(map, func(value any) any {
    index=index+1
    return index    ## nth leaf on the same level
})

This API allows for more complex editing/retrieval of leaf node values (the simpler Get/Set are infact expressed in terms of Setter/Getter)

refactoring and final look

The earlier messy code for editing the golang map can then be replaced by:

// UnwrapAvroUnion unwraps avro wrapped maps
func UnwrapAvroUnion(data map[string]interface{}) map[string]interface{} {
    if data == nil {
        return nil
    }

    // v, r, s
    unwrapType(data, vLens, "bytes")
    unwrapType(data, rLens, "bytes")
    unwrapType(data, sLens, "bytes")

    // to, from
    unwrapType(data, toLens, "string")
    unwrapType(data, fromLens, "string")

    // withdrawalsRoot
    unwrapType(data, withdrawalsRootLens, "string")

    // withdrawals, uncles
    unwrapType(data, withdrawalsLens, "array")
    unwrapType(data, uncleLens, "array")

    return data
}

func unwrapType(data map[string]interface{}, lenspath *lensp.Lenspath, nonNilType string) {
    lenspathSetter(data, lenspath, func(leafd any) any {
        if leafd == nil {
            return nil
        }

        mp := leafd.(map[string]interface{})

        return mp[nonNilType]
    })
}