APIs · UnROOT.jl

Commonly used

UnROOT.LazyBranch — Type

LazyBranch(f::ROOTFile, branch)

Construct an accessor for a given branch such that both BA[idx] and BA[1:20] are type-stable. The memory footprint is a single basket (usually <1MB). You can also iterate or map over it. If you want a concrete Vector, simply collect() the LazyBranch.

Example

julia> rf = ROOTFile("./test/samples/tree_with_large_array.root");

julia> b = rf["t1/int32_array"];

julia> ab = UnROOT.LazyBranch(rf, b);

julia> for entry in ab
           @show entry
           break
       end
entry = 0

julia> ab[begin:end]
0
1
...
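
To make the single-basket memory footprint concrete, here is a minimal toy sketch of a lazy vector that keeps only one chunk in memory at a time. It is not UnROOT's implementation: `ToyLazyVector`, its `loader` callback and the `nloads` counter are hypothetical names used only for illustration.

```julia
# Toy illustration only; NOT UnROOT's LazyBranch. All names are hypothetical.
mutable struct ToyLazyVector{T} <: AbstractVector{T}
    loader::Function      # loader(k) returns chunk k as a Vector{T} (stand-in for reading a basket)
    chunk_len::Int        # entries per chunk (the last chunk may be shorter)
    total_len::Int        # total number of entries
    cached_chunk::Int     # index of the chunk held in memory, 0 if none
    buffer::Vector{T}     # the single cached chunk
    nloads::Int           # how many times a chunk had to be loaded
end

# Convenience constructor: start with an empty cache.
ToyLazyVector{T}(loader, chunk_len, total_len) where {T} =
    ToyLazyVector{T}(loader, chunk_len, total_len, 0, T[], 0)

Base.size(v::ToyLazyVector) = (v.total_len,)
Base.IndexStyle(::Type{<:ToyLazyVector}) = IndexLinear()

function Base.getindex(v::ToyLazyVector{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)
    chunk = (i - 1) ÷ v.chunk_len + 1
    if chunk != v.cached_chunk
        # Cache miss: replace the cached chunk with the one containing entry i.
        v.buffer = v.loader(chunk)::Vector{T}
        v.cached_chunk = chunk
        v.nloads += 1
    end
    return v.buffer[(i - 1) % v.chunk_len + 1]
end

# Ten entries stored in chunks of four; the loader fabricates the data on demand.
v = ToyLazyVector{Int}(k -> collect(((k - 1) * 4 + 1):min(k * 4, 10)), 4, 10)

first_two = (v[1], v[2])      # both entries live in chunk 1
loads_after_two = v.nloads    # so only one load has happened
all_values = collect(v)       # materialize everything into a concrete Vector
```

Sequential access touches each chunk once, which is why iterating such a structure stays cheap while random access across chunks causes repeated loads.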

UnROOT.LazyTree — Method

LazyTree(f::ROOTFile, s::AbstractString, branch::Union{AbstractString, Regex})
LazyTree(f::ROOTFile, s::AbstractString, branch::Vector{Union{AbstractString, Regex}})

Constructor for LazyTree, which is close to a DataFrame (interface-wise) and a lazy Table (speed-wise). Looping over a LazyTree is fast and type-stable. Internally, LazyTree contains a typed table whose branches are LazyBranch objects. This means that at any given time only N baskets are cached, where N is the number of branches.

Note

Accessing with [start:stop] will return a LazyTree with a concrete internal table.

Warning

Split branches are re-named, and the exact renaming may change. See Issue 156 for context.

Example

julia> mytree = LazyTree(f, "Events", ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
 Row │ Electron_dxy     nMuon   Muon_eta         Muon_pt
     │ Vector{Float32}  UInt32  Vector{Float32}  Vector{Float32}
─────┼───────────────────────────────────────────────────────────
 1   │ [0.000371]       0       []               []
 2   │ [-0.00982]       2       [0.53, 0.229]    [19.9, 15.3]
 3   │ []               0       []               []
 4   │ [-0.00157]       0       []               []
 ⋮   │     ⋮            ⋮             ⋮                ⋮

UnROOT.LazyTree — Method

function LazyTree(f::ROOTFile, tree::TTree, treepath, branches; sink = LazyTree)

Creates a lazy tree object of the selected branches only. branches is a vector of String, Regex or Pair{Regex, SubstitutionString}, where for a Pair the first item is the regex selector and the second item is the rename pattern. An alternative container can be used by providing a sink function. The sink function must take as its argument a table with a Tables.jl interface. The table columns are filled with LazyBranch objects.

More Internal

UnROOT.Cursor — Type

The Cursor type is embedded into Branches of a TTree such that when we need to read the content of a Branch, we don't need to go through the Directory and find the TKey and then seek to where the Branch is.

Note

The io inside a Cursor is in fact only a buffer; it is NOT an io that refers to the whole file's stream.

UnROOT.LeafField — Type

struct LeafField{T}
    content_col_idx::Int
    columnrecord::ColumnRecord
end

Base case of field nesting; this links to a column in the RNTuple by 0-based index. T is the eltype of this field, which mostly uses Julia native types except for Switch.

The type field is the RNTuple spec type number, used to record split encoding.

UnROOT.OffsetBuffer — Type

OffsetBuffer

Works with seek and position of the original file. Think of it as a view of an IOStream that can be indexed with the original positions.

UnROOT.Preamble — Method

Reads the preamble of an object.

The cursor will be put into the right place depending on the data.

UnROOT.RNTuple — Type

RNTuple

This is the struct for holding all metadata (schema) needed to completely describe an RNTuple from ROOT. Just like with TTree, to obtain a table-like data object you need to use LazyTree explicitly:

Example

julia> f = ROOTFile("./test/samples/RNTuple/test_ntuple_stl_containers.root");

julia> f["ntuple"]
UnROOT.RNTuple:
  header:
    name: "ntuple"
    ntuple_description: ""
    writer_identifier: "ROOT v6.29/01"
    schema:
      RNTupleSchema with 13 top fields
      ├─ :lorentz_vector ⇒ Struct
      ├─ :vector_tuple_int32_string ⇒ Vector
      ├─ :string ⇒ String
      ├─ :vector_string ⇒ Vector
...
..
.

julia> LazyTree(f, "ntuple")
 Row │ string  vector_int32     array_float      vector_vector_i     vector_string       vector_vector_s     variant_int32_s  vector_variant_     ⋯
     │ String  Vector{Int32}    StaticArraysCor  Vector{Vector{I     Vector{String}      Vector{Vector{S     Union{Int32, St  Vector{Union{In     ⋯
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1   │ one     [1]              [1.0, 1.0, 1.0]  Vector{Int32}[Int3  ["one"]             [["one"]]           1                Union{Int64, Strin  ⋯
 2   │ two     [1, 2]           [2.0, 2.0, 2.0]  Vector{Int32}[Int3  ["one", "two"]      [["one"], ["two"]]  two              Union{Int64, Strin  ⋯
 3   │ three   [1, 2, 3]        [3.0, 3.0, 3.0]  Vector{Int32}[Int3  ["one", "two", "th  [["one"], ["two"],  three            Union{Int64, Strin  ⋯
 4   │ four    [1, 2, 3, 4]     [4.0, 4.0, 4.0]  Vector{Int32}[Int3  ["one", "two", "th  [["one"], ["two"],  4                Union{Int64, Strin  ⋯
 5   │ five    [1, 2, 3, 4, 5]  [5.0, 5.0, 5.0]  Vector{Int32}[Int3  ["one", "two", "th  [["one"], ["two"],  5                Union{Int64, Strin  ⋯
                                                                                                                                  5 columns omitted

UnROOT.RNTupleCardinality — Type

struct RNTupleCardinality{T}
    content_col_idx::Int
    nbits::Int
end

Special field. The cardinality is basically a counter, but the data column is a leaf column of Index32 or Index64. To get a number from Cardinality, one needs to compute ary[i] - ary[i-1].
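
A minimal sketch of that subtraction, assuming the index column stores cumulative end positions and that the first entry is measured from zero (an assumption; the text above only gives ary[i] - ary[i-1]). The helper name `toy_cardinality` is hypothetical and not part of UnROOT.

```julia
# Hypothetical helper: turn cumulative offsets into per-entry counts.
function toy_cardinality(ary::AbstractVector{T}) where {T<:Integer}
    out = similar(ary)
    prev = zero(T)               # assumption: the entry before the first one ends at 0
    for i in eachindex(ary)
        out[i] = ary[i] - prev   # ary[i] - ary[i-1]
        prev = ary[i]
    end
    return out
end

offsets = Int32[2, 2, 5, 6]          # cumulative end offsets for four entries
counts = toy_cardinality(offsets)    # sizes of the four entries
```

An entry whose offset equals the previous one (the second entry here) has cardinality zero, i.e. it is an empty collection.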

UnROOT.RNTupleField — Type

mutable struct RNTupleField{R, F, O, E} <: AbstractVector{E}

Not a counterpart of the RNTuple field in ROOT. This is a user-facing Julia-only construct, like LazyBranch, that is meant to act like a lazy AbstractVector backed by a file IO source and a schema field from RNTuple.schema.

R is the type of the parent RNTuple
F is the type of the field in the schema
O is the type of output when you read a cluster-worth of data
E is the element type of O (i.e. what you get for each event (row) in iteration)

UnROOT.RNTupleSchema — Type

struct RNTupleSchema

A wrapper struct for print_tree implementation of the schema display.

Example

julia> f = ROOTFile("./test/samples/RNTuple/test_ntuple_stl_containers.root");

julia> f["ntuple"].schema
RNTupleSchema with 13 top fields
├─ :lorentz_vector ⇒ Struct
│                    ├─ :pt ⇒ Leaf{Float32}(col=26)
│                    ├─ :eta ⇒ Leaf{Float32}(col=27)
│                    ├─ :phi ⇒ Leaf{Float32}(col=28)
│                    └─ :mass ⇒ Leaf{Float32}(col=29)
├─ :vector_tuple_int32_string ⇒ Vector
│                               ├─ :offset ⇒ Leaf{Int32}(col=9)
│                               └─ :content ⇒ Struct
│                                             ├─ :_1 ⇒ String
│                                             │        ├─ :offset ⇒ Leaf{Int32}(col=37)
│                                             │        └─ :content ⇒ Leaf{Char}(col=38)
│                                             └─ :_0 ⇒ Leaf{Int32}(col=36)
├─ :string ⇒ String
│            ├─ :offset ⇒ Leaf{Int32}(col=1)
│            └─ :content ⇒ Leaf{Char}(col=2)
├─ :vector_string ⇒ Vector
│                   ├─ :offset ⇒ Leaf{Int32}(col=5)
│                   └─ :content ⇒ String
│                                 ├─ :offset ⇒ Leaf{Int32}(col=13)
│                                 └─ :content ⇒ Leaf{Char}(col=14)
...
..
.

UnROOT.ROOTFile — Method

ROOTFile(filename::AbstractString; customstructs = Dict("TLorentzVector" => LorentzVector{Float64}))

ROOTFile's constructor from a file. The customstructs dictionary maps an fClassName (as found in a Branch) to a user-defined struct, so that UnROOT knows how to interpret such branches; see interped_data.

See also: LazyTree, LazyBranch

Example

julia> f = ROOTFile("test/samples/NanoAODv5_sample.root")
ROOTFile with 2 entries and 21 streamers.
test/samples/NanoAODv5_sample.root
└─ Events
   ├─ "run"
   ├─ "luminosityBlock"
   ├─ "event"
   ├─ "HTXS_Higgs_pt"
   ├─ "HTXS_Higgs_y"
   └─ "⋮"

UnROOT.StdArrayField — Type

StdArrayField<N, T>

Special base-case field for a leaf field representing std::array<T, N>. This is because RNTuple would serialize it as a leaf field but with flags == 0x0001 in the field description. In total, there are two field descriptions associated with array<>: one for the metadata (the N), the other for the actual data.

UnROOT.Streamers — Method

function Streamers(io)

Reads all the streamers from the ROOT source.

UnROOT.StringField — Type

StringField

Special base-case field for String leaf field. This is because RNTuple splits a leaf String field into two columns (instead of split in field records). So we need an offset column and a content column (that contains Chars).
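
The offset-plus-content layout can be sketched as follows. This is an illustration under stated assumptions, not UnROOT's reader: it assumes the offset column holds cumulative end positions and uses raw UTF-8 bytes as a stand-in for the content column; `toy_strings_from_columns` is a hypothetical name.

```julia
# Hypothetical helper: rebuild strings from an offset column and a content column.
function toy_strings_from_columns(offsets::AbstractVector{<:Integer},
                                  content::AbstractVector{UInt8})
    out = Vector{String}(undef, length(offsets))
    start = 0                                   # assumption: the first string starts at 0
    for (i, stop) in enumerate(offsets)
        out[i] = String(content[start+1:stop])  # slice copies, so String may take ownership
        start = stop
    end
    return out
end

offset_col = Int32[3, 6, 11]
content_col = Vector{UInt8}("onetwothree")
strings = toy_strings_from_columns(offset_col, content_col)
```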

UnROOT.TH — Method

TH(io, tkey::TKey, refs)

Internal function used to form a fields = Dict{Symbol, Any}() that represents the fields of a TH (histogram) in C++ ROOT.

UnROOT._field_output_type — Method

_field_output_type(::Type{F}) where F

This function is used in two ways:

It provides an output type prediction for each "field" in RNTuple so we can achieve type stability.

It is also used to enforce type stability in read_field:

    # this is basically a type assertion for `res`
    return res::_field_output_type(field)
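
The assertion pattern can be tried in isolation. The sketch below uses hypothetical names (`toy_output_type`, `toy_read`) and an untyped Dict as a stand-in for an IO source; it only illustrates the technique of asserting a predicted type on a value the compiler would otherwise treat as Any.

```julia
# Hypothetical type-prediction function: maps a "field" type to its output type.
toy_output_type(::Type{Int32}) = Vector{Int32}
toy_output_type(::Type{String}) = Vector{String}

function toy_read(store::Dict{Symbol, Any}, name::Symbol, ::Type{F}) where {F}
    res = store[name]                 # the compiler only knows this is `Any`
    return res::toy_output_type(F)    # type assertion: throws if the prediction is wrong
end

store = Dict{Symbol, Any}(:x => Int32[1, 2, 3], :s => ["a", "b"])

xs = toy_read(store, :x, Int32)       # a Vector{Int32}

# Asking for the wrong type fails loudly instead of silently returning `Any`.
mismatch_caught = try
    toy_read(store, :x, String)
    false
catch err
    err isa TypeError
end
```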

UnROOT._rntuple_clusterrange — Method

The event number range that a given cluster covers, in Julia's 1-based indexing.

UnROOT.array — Method

array(f::ROOTFile, path; raw=false)

Reads an array from a branch. Set raw=true to return raw data and correct offsets.

UnROOT.arrays — Method

arrays(f::ROOTFile, treename)

Reads all branches from a tree.

UnROOT.auto_T_JaggT — Method

auto_T_JaggT(f::ROOTFile, branch; customstructs::Dict{String, Type})

Given a file and branch, automatically return (eltype, Jaggtype). This function is aware of custom structs that are carried with the parent ROOTFile.

This is also where you may want to "redirect" classname -> Julia struct name, for example "TLorentzVector" => LorentzVector here and you can focus on LorentzVectors.LorentzVector methods from here on.

See also: ROOTFile, interped_data

UnROOT.basketarray — Method

basketarray(f::ROOTFile, path::AbstractString, ith)
basketarray(f::ROOTFile, branch::Union{TBranch, TBranchElement}, ith)
basketarray(lb::LazyBranch, ith)

Reads actual data from ith basket of a branch. This function first calls readbasket to obtain raw bytes and offsets of a basket, then calls auto_T_JaggT followed by interped_data to translate raw bytes into actual data.

UnROOT.basketarray_iter — Method

basketarray_iter(f::ROOTFile, branch::Union{TBranch, TBranchElement})
basketarray_iter(lb::LazyBranch)

Returns a Base.Generator yielding the output of basketarray() for all baskets.

UnROOT.chaintrees — Method

chaintrees(ts)

Chain a collection of LazyTrees together to form a larger tree. Every tree should have identical branch names and types; we're not trying to re-implement SQL here.

Example

julia> typeof(tree)
LazyTree with 1 branches:
a

julia> tree2 = UnROOT.chaintrees([tree,tree]);

julia> eltype(tree.a) == eltype(tree2.a)
true

julia> length(tree)
100

julia> length(tree2)
200

julia> eltype(tree)
UnROOT.LazyEvent{NamedTuple{(:a,), Tuple{LazyBranch{Int32, UnROOT.Nojagg, Vector{Int32}}}}}

julia> eltype(tree2)
UnROOT.LazyEvent{NamedTuple{(:a,), Tuple{SentinelArrays.ChainedVector{Int32, LazyBranch{Int32, UnROOT.Nojagg, Vector{Int32}}}}}}

UnROOT.compressed_datastream — Method

compressed_datastream(io, tkey)

Extract all [compressionheader][rawbytes] from a TKey. This is an isolated function because we want to compartmentalize disk I/O as much as possible.

See also: decompress_datastreambytes

UnROOT.decompress_datastreambytes — Method

decompress_datastreambytes(compbytes, tkey)

Process the compressed bytes compbytes, which were read out by compressed_datastream and are pointed to by tkey. This function simply returns the uncompressed bytes according to the compression algorithm detected (or the lack thereof).

UnROOT.endcheck — Method

function endcheck(io, preamble::Preamble)

Checks if everything went well after parsing a TObject. Used in conjunction with Preamble.

UnROOT.interped_data — Method

interped_data(rawdata, rawoffsets, ::Type{T}, ::Type{J}) where {T, J<:JaggType}

The function that interprets raw bytes (from a basket) into the corresponding Julia data, based on type T and jagg type J.

In order to retrieve data from custom branches, users should define more specialized methods of this function with specific T and J. See the TLorentzVector example.

UnROOT.interped_data — Method

interped_data(rawdata, rawoffsets, ::Type{Vector{LorentzVector{Float64}}}, ::Type{Offsetjagg})

The interped_data method specialized for LorentzVector. This method will get called by basketarray instead of the default method for a TLorentzVector branch.

UnROOT.isvoid — Method

isvoid(::Type{T})

Internal function to determine (by only looking at the type) if an RNTuple field is recursively empty. A field is empty if there's no more data column attached to it from this point forward.

For example, the :_0 field is empty here:

├─ Symbol("AntiKt4TruthDressedWZJetsAux:") ⇒ Struct
│                                            ├─ :m ⇒ Vector
│                                            │       ├─ :offset ⇒ Leaf{UnROOT.Index64}(col=23)
│                                            │       └─ :content ⇒ Leaf{Float32}(col=24)
│                                            ├─ Symbol(":_0") ⇒ Struct
│                                            │                  ├─ Symbol(":_2") ⇒ Struct
│                                            │                  ├─ Symbol(":_1") ⇒ Struct
│                                            │                  ├─ Symbol(":_0") ⇒ Struct
│                                            │                  │                  └─ Symbol(":_0") ⇒ Struct
│                                            │                  └─ Symbol(":_3") ⇒ Struct

When we parse the schema, we discard anything that cannot possibly produce readable data.

UnROOT.parseTH — Method

parseTH(th::Dict{Symbol, Any}; raw=true) -> (counts, edges, sumw2, nentries)
parseTH(th::Dict{Symbol, Any}; raw=false) -> Union{FHist.Hist1D, FHist.Hist2D}

When raw=true, parse the output of TH into a tuple of counts, edges, sumw2, and nentries. When raw=false, parse the output of TH into FHist.jl histograms.

Example

julia> UnROOT.parseTH(UnROOT.samplefile("histograms1d2d.root")["myTH1D"])
([40.0, 2.0], (-2.0:2.0:2.0,), [800.0, 2.0], 4.0)

julia> UnROOT.parseTH(UnROOT.samplefile("histograms1d2d.root")["myTH1D"]; raw=false)
edges: -2.0:2.0:2.0
bin counts: [40.0, 2.0]
total count: 42.0

Note

TH1 and TH2 inputs are supported.
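
As a hedged illustration of working with the raw=true output, the snippet below starts from the tuple printed in the example above (copied as a literal, so no file is needed) and derives bin centers and per-bin uncertainties. Taking the square root of sumw2 as the uncertainty is the usual convention for weighted histograms; it is an assumption here, not something parseTH does for you.

```julia
# Literal copy of the raw tuple shown in the example above.
counts, edges, sumw2, nentries = ([40.0, 2.0], (-2.0:2.0:2.0,), [800.0, 2.0], 4.0)

xedges = collect(edges[1])                           # 1D histogram: one set of edges
centers = (xedges[1:end-1] .+ xedges[2:end]) ./ 2    # midpoint of each bin
errors = sqrt.(sumw2)                                # assumed convention: sqrt of summed squared weights
total = sum(counts)                                  # matches "total count" in the raw=false output
```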

UnROOT.parsetobject — Method

Direct parsing of streamed objects which are not sitting on branches. This function needs to be rewritten, so that it can create proper types of TObject inherited data (like TVectorT<*>).

UnROOT.read_field — Method

read_field(io, field::F, page_list) where F

Read a field from the io stream. The page_list is a list of PageLinks for the current cluster group. The type stability is achieved by type asserting based on type F via _field_output_type function.

UnROOT.read_field — Method

read_field(io, field::StructField{N, T}, page_list) where {N, T}

Since each field of the struct is stored in a separate field of the RNTuple, this function returns a StructArray to maximize efficiency.

UnROOT.read_pagedesc — Method

read_pagedesc(io, pagedescs::AbstractVector{PageDescription}, cr::ColumnRecord)

Read the decompressed raw bytes given a Page Description. The nbits need to be provided according to the element type of the column since pagedesc only contains num_elements information.

Note

We handle split, zigzag, and delta encodings inside this function.
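
The three encodings named in the note can be sketched in plain Julia. This is a hedged illustration of the general techniques, not the code UnROOT runs: the function names are hypothetical, and the byte layout assumed for split encoding (all first bytes, then all second bytes, and so on, in little-endian order) is an assumption about the RNTuple format rather than something stated above.

```julia
# Undo byte splitting: the input holds byte 1 of every element, then byte 2 of every
# element, etc. (assumed layout). Interleave them back and reinterpret as T.
function toy_unsplit(bytes::AbstractVector{UInt8}, ::Type{T}) where {T}
    S = sizeof(T)
    n = length(bytes) ÷ S
    out = Vector{UInt8}(undef, n * S)
    for i in 1:n, j in 1:S
        out[(i - 1) * S + j] = bytes[(j - 1) * n + i]
    end
    return ltoh.(reinterpret(T, out))   # assumed little-endian on disk
end

# Zigzag maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ...; this inverts the mapping.
toy_unzigzag(u::T) where {T<:Unsigned} = signed((u >> 1) ⊻ -(u & one(T)))

# Delta encoding stores differences; a cumulative sum restores the original values.
toy_undelta(deltas::AbstractVector{<:Integer}) = cumsum(deltas)

unsplit_values = toy_unsplit(UInt8[0x02, 0x04, 0x01, 0x03], UInt16)
unzigzagged = toy_unzigzag.(UInt32[0, 1, 2, 3, 4])
undeltaed = toy_undelta(Int32[10, 1, 1, 3])
```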

UnROOT.readbasket — Method

readbasket(f::ROOTFile, branch, ith)
readbasketseek(f::ROOTFile, branch::Union{TBranch, TBranchElement}, seek_pos::Int, nbytes)

The fundamental building block of reading data from a .root file. Reads one basket's raw bytes and offsets at a time. These raw bytes and offsets then (potentially) get processed by interped_data.

See also: auto_T_JaggT, basketarray

UnROOT.readobjany! — Method

function readobjany!(io, tkey::TKey, refs)

The main entrypoint where streamers are parsed and cached for later use. The refs dictionary holds the streamers or parsed data which are reused when already available.

UnROOT.rnt_ary_to_page — Method

rnt_ary_to_page(ary::AbstractVector, cr::ColumnRecord)

Turns an AbstractVector into a page of an RNTuple. The element type must be primitive for this to work.

UnROOT.rnt_col_to_ary — Method

rnt_col_to_ary(col) -> Vector{Vector}

Normalize each user-facing "column" into a collection of Vector{<:Real} ready to be written to a page. After calling this on all user-facing "columns", we should have as many arys as our ColumnRecords, in the same order.

UnROOT.rnt_write — Method

@SimpleStruct struct ClusterGroupRecord
    minimumentrynumber::Int64
    entryspan::Int64
    numclusters::Int32
    pagelistlink::EnvLink
end

UnROOT.skiptobj — Method

function skiptobj(io)

Skips a TObject.

UnROOT.splitup — Method

splitup(data::Vector{UInt8}, offsets, T::Type; skipbytes=0)

Given the offsets and data returned by array(...; raw = true), reconstruct the actual array (with custom struct; it can be jagged as well).
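
The idea of rebuilding a jagged array from raw bytes and offsets can be sketched as below. This is a simplified illustration, not the actual splitup: it ignores skipbytes and custom structs, and it assumes the offsets are 0-based byte positions with one more entry than there are rows. ROOT serializes data big-endian, hence the ntoh call. `toy_splitup` is a hypothetical name.

```julia
# Hypothetical, simplified reconstruction of a jagged array of primitive T.
function toy_splitup(data::Vector{UInt8}, offsets::Vector{Int}, ::Type{T}) where {T}
    nrows = length(offsets) - 1
    out = Vector{Vector{T}}(undef, nrows)
    for i in 1:nrows
        chunk = data[offsets[i]+1:offsets[i+1]]              # bytes belonging to row i
        out[i] = T[ntoh(x) for x in reinterpret(T, chunk)]   # big-endian to host order
    end
    return out
end

# Build big-endian test bytes for the values 1, 2, 3 and split them into rows
# of length 2, 0 and 1.
raw = collect(reinterpret(UInt8, hton.(Int32[1, 2, 3])))
rows = toy_splitup(raw, [0, 8, 8, 12], Int32)
```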

UnROOT.topological_sort — Method

function topological_sort(streamer_infos)

Sort the streamers with respect to their dependencies and keep only those which are not defined already.

The implementation is based on https://stackoverflow.com/a/11564769/1623645
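
For readers unfamiliar with the technique, here is a minimal depth-first sketch of dependency sorting that also skips names which are already defined. It is not UnROOT's implementation: the input shape (a Dict from a name to the names it depends on), the `defined` keyword and the function name `toy_toposort` are assumptions made for illustration, and there is no cycle detection.

```julia
# Hypothetical sketch: order names so that every dependency precedes its dependents.
function toy_toposort(deps::Dict{String, Vector{String}};
                      defined::Set{String} = Set{String}())
    order = String[]
    visited = Set{String}()
    function visit(name)
        # Skip anything already emitted or already defined elsewhere.
        (name in visited || name in defined) && return
        push!(visited, name)
        for d in get(deps, name, String[])
            visit(d)                  # dependencies first
        end
        push!(order, name)            # then the item itself
        return
    end
    foreach(visit, sort!(collect(keys(deps))))   # sorted keys give a deterministic result
    return order
end

deps = Dict(
    "TH1F"    => ["TH1", "TArrayF"],
    "TH1"     => ["TNamed"],
    "TNamed"  => ["TObject"],
    "TArrayF" => String[],
    "TObject" => String[],
)
order = toy_toposort(deps; defined = Set(["TObject"]))
```

Because "TObject" is passed as already defined, it is left out of the result while everything that depends on it is still ordered correctly.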

UnROOT.unpack — Method

unpack(x::CompressionHeader)

Return the following information:

Name of compression algorithm
Level of the compression
compressedbytes and uncompressedbytes according to uproot3
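
A hedged sketch of what such unpacking might look like, assuming the 9-byte ROOT compression header layout used by uproot3: two bytes for the algorithm tag, one byte for the level, then two 3-byte little-endian sizes. The function name `toy_unpack` and the choice to take a plain byte vector are hypothetical; the real method takes a CompressionHeader.

```julia
# Hypothetical stand-in for unpack, operating on the 9 raw header bytes.
function toy_unpack(header::AbstractVector{UInt8})
    length(header) == 9 || throw(ArgumentError("expected a 9-byte header"))
    algname = String(header[1:2])    # e.g. "ZL", "XZ", "L4" or "ZS"
    level = header[3]
    # Widen to Int before shifting; shifting a UInt8 left by 8 bits would give 0x00.
    compressedbytes   = Int(header[4]) + (Int(header[5]) << 8) + (Int(header[6]) << 16)
    uncompressedbytes = Int(header[7]) + (Int(header[8]) << 8) + (Int(header[9]) << 16)
    return (; algname, level, compressedbytes, uncompressedbytes)
end

# 0x5a 0x53 is "ZS"; the two sizes encode 10_000 and 1_000_000 bytes.
info = toy_unpack(UInt8[0x5a, 0x53, 0x05, 0x10, 0x27, 0x00, 0x40, 0x42, 0x0f])
```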

UnROOT.@SimpleStruct — Macro

macro SimpleStruct

Defines a reading method for _rntuple_read on the fly.

Example

julia> @SimpleStruct struct Locator
           num_bytes::Int32
           offset::UInt64
       end

would automatically define the following reading method:

function _rntuple_read(io, ::Type{Locator})
    num_bytes = _rntuple_read(io, Int32)
    offset = _rntuple_read(io, UInt64)
    Locator(num_bytes, offset)
end

Notice _rntuple_read falls back to read for all types that are not defined by us.

UnROOT.@stack — Macro

macro stack(into, structs...)

Stack the fields of multiple structs and create a new one. The first argument is the name of the new struct, followed by the ones to be stacked. Parametric types are not supported and the field names need to be unique.

Example:

@stack Baz Foo Bar

Creates Baz with the concatenated fields of Foo and Bar