Parse Custom Branch

It is possible to parse Branches with custom structure as long as you know how the bytes should be interpreted. As an example, the TLorentzVector is added using this mechanism and we will walk through the steps needed:

1. Provide a map between fClassName of your struct (as seen in .root) to a Julia type.

Pass a Dict{String, Type} to ROOTFile(filepath; customstructs). The TLorentzVector is shipped by default:

ROOTFile(filepath; customstructs = Dict("TLorentzVector" => LorentzVector{Float64}))

This Dict will subsequently be used by the auto_T_JaggT function at here such that when we encounter a branch with this fClassName, we will return your Type as the detected element type of this branch.

2. Extend the raw bytes interpreting function UnROOT.interped_data

By default, given a branch element type and a "jaggness" type, a general function is defined which will try to parse the raw bytes into Julia data structure. The ::Type{T} will match what you have provided in the Dict in the previous step.

Thus, to "teach" UnROOT how to interpret bytes for your type T, you would want to defined a more specific UnROOT.interped_data than the default one. Taking the TLorentzVector as example again, we define a function:

using LorentzVector
const LVF64 = LorentzVector{Float64}
function UnROOT.interped_data(rawdata, rawoffsets, ::Type{LVF64}, ::Type{J}) where {T, J <: JaggType}
    # `rawoffsets` is actually redundant, since we know each TLV is always 64 bytes (withe 32 bytes header)
    [
     reinterpret(LVF64, x) for x in Base.Iterators.partition(rawdata, 64)
    ]
end

# VorView is defined in the `src/custom.jl`
function Base.reinterpret(::Type{LVF64}, v::VorView) where T
    # x,y,z,t in ROOT
    v4 = ntoh.(reinterpret(Float64, v[1+32:end]))
    # t,x,y,z in LorentzVectors.jl
    LVF64(v4[4], v4[1], v4[2], v4[3])
end

The Base.reinterpret function is just a helper function, you could instead write everything inside UnROOT.interped_data. We then builds on these, to interpret Jagged TLV branch: https://github.com/JuliaHEP/UnROOT.jl/blob/4747f6f5fd97ed1a872765485b4eb9e99ec5a650/src/custom.jl#L47

More details

To expand a bit what we're doing here, the rawdata for a single TLV is always 64 bytes long and the first 32 bytes are TObject header which we don't care (which is why we don't care about rawoffsets here). The last 32 bytes make up 4 Float64 and we simply parse them and return a collection of (julia) LorentzVector{Float64}.

In general, if auto_T_JaggT returned MyType as promised branch element type, then

UnROOT.interped_data(rawdata, rawoffsets, ::Type{MyType},

should return Vector{MyType} because UnROOT.interped_data receives raw bytes of a basket at a time.

And that's it! Afterwards both LazyBranch and LazyTree will be able to constructed with correct type and also knows how to interpret bytes when you indexing or iterating through them

Reading Raw Data from Branch

Alternatively, reading raw data is also possible using the UnROOT.array(f::ROOTFile, path; raw=true) method. The output can be then reinterpreted using a custom type with the method UnROOT.splitup(data, offsets, T::Type; skipbytes=0, jagged=true). This provides more fine grain control in case your branch is highly irregular. You can then define suitable Julia type and readtype method for parsing these data. Alternatively, you can of course parse the data and offsets entirely manually. Here is it in action, with the help of the types from custom.jl, and some data from the KM3NeT experiment:

julia> using UnROOT

julia> f = ROOTFile("test/samples/km3net_online.root")
ROOTFile("test/samples/km3net_online.root") with 10 entries and 41 streamers.

julia> data, offsets = array(f, "KM3NET_EVENT/KM3NET_EVENT/snapshotHits"; raw=true)
2058-element Array{UInt8,1}:
 0x00
 0x03
   ⋮
   
julia> UnROOT.splitup(data, offsets, UnROOT._KM3NETDAQHit)
4-element Vector{Vector{UnROOT._KM3NETDAQHit}}:
 [UnROOT._KM3NETDAQHit(1073742790, 0x00, 9, 0x60)......