Commonly used
UnROOT.LazyBranch — Type
LazyBranch(f::ROOTFile, branch)
Construct an accessor for a given branch such that BA[idx] and BA[1:20] are type-stable. The memory footprint is a single basket (usually <1MB). You can also iterate or map over it. If you want a concrete Vector, simply collect() the LazyBranch.
Example
julia> rf = ROOTFile("./test/samples/tree_with_large_array.root");
julia> b = rf["t1/int32_array"];
julia> ab = UnROOT.LazyBranch(rf, b);
julia> for entry in ab
           @show entry
           break
       end
entry = 0
julia> ab[begin:end]
0
1
...
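The "single basket" memory footprint can be pictured with a small stand-alone sketch. This is not how UnROOT implements LazyBranch; OneBasketVector, its fields, and the load callback are hypothetical names, and the only assumption is that data arrives in fixed-size chunks that can be fetched by index.

```julia
# Hypothetical stand-in for a lazy branch: at most one "basket" (chunk) is cached.
mutable struct OneBasketVector{T} <: AbstractVector{T}
    load::Function        # basket index -> Vector{T} (stands in for file I/O)
    basket_size::Int      # number of entries per basket
    len::Int              # total number of entries
    cached_index::Int     # index of the cached basket, 0 if none
    cache::Vector{T}      # contents of the cached basket
end

# Convenience constructor starting with an empty cache
OneBasketVector{T}(load, basket_size, len) where {T} =
    OneBasketVector{T}(load, basket_size, len, 0, T[])

Base.size(v::OneBasketVector) = (v.len,)

function Base.getindex(v::OneBasketVector{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)
    b = (i - 1) ÷ v.basket_size + 1          # basket that holds entry i
    if b != v.cached_index                   # cache miss: replace the cached basket
        v.cache = v.load(b)::Vector{T}
        v.cached_index = b
    end
    return v.cache[i - (b - 1) * v.basket_size]
end

# Fake "file" of 12 entries in baskets of 4; count how often a basket is loaded
loads = Ref(0)
v = OneBasketVector{Int}(b -> (loads[] += 1; collect((b - 1) * 4 .+ (1:4))), 4, 12)

first_two = (v[1], v[2])      # both served by basket 1
fifth = v[5]                  # forces basket 2 to be loaded
loads_so_far = loads[]
everything = collect(v)       # a concrete Vector, as with collect() on a LazyBranch
```

Because the type subtypes AbstractVector, iteration, map, and collect come for free from Base once size and getindex are defined.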
UnROOT.LazyTree — Method
LazyTree(f::ROOTFile, s::AbstractString, branch::Union{AbstractString, Regex})
LazyTree(f::ROOTFile, s::AbstractString, branch::Vector{Union{AbstractString, Regex}})
Constructor for LazyTree, which is close to a DataFrame (interface-wise) and a lazy Table (speed-wise). Looping over a LazyTree is fast and type-stable. Internally, a LazyTree contains a typed table whose branches are LazyBranch objects. This means that at any given time only N baskets are cached, where N is the number of branches.
Accessing with [start:stop] will return a LazyTree with a concrete internal table.
Split branches are renamed, and the exact renaming may change. See Issue 156 for context.
Example
julia> mytree = LazyTree(f, "Events", ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
 Row │ Electron_dxy     nMuon   Muon_eta         Muon_pt
     │ Vector{Float32}  UInt32  Vector{Float32}  Vector{Float32}
─────┼───────────────────────────────────────────────────────────
 1   │ [0.000371]       0       []               []
 2   │ [-0.00982]       2       [0.53, 0.229]    [19.9, 15.3]
 3   │ []               0       []               []
 4   │ [-0.00157]       0       []               []
 ⋮   │ ⋮                ⋮       ⋮                ⋮
UnROOT.LazyTree — Method
function LazyTree(f::ROOTFile, tree::TTree, treepath, branches; sink = LazyTree)
Creates a lazy tree object of the selected branches only. branches is a vector of String, Regex or Pair{Regex, SubstitutionString}, where the first item is the regex selector and the second item the rename pattern. An alternative container can be used by providing a sink function. The sink function must take as its argument a table with a Tables.jl interface. The table columns are filled with LazyBranch objects.
More Internal
UnROOT.Cursor — Type
The Cursor type is embedded into the Branches of a TTree so that, when we need to read the content of a Branch, we don't need to go through the Directory, find the TKey, and then seek to where the Branch is.
The io inside a Cursor is in fact only a buffer; it is NOT an io that refers to the whole file's stream.
UnROOT.LeafField — Type
struct LeafField{T}
    content_col_idx::Int
    columnrecord::ColumnRecord
end
Base case of field nesting; this links to a column in the RNTuple by 0-based index. T is the eltype of this field, which mostly uses Julia native types except for Switch.
The type field is the RNTuple spec type number, used to record split encoding.
UnROOT.OffsetBuffer — Type
OffsetBuffer
Works with seek and position of the original file. Think of it as a view of an IOStream that can be indexed with original positions.
UnROOT.Preamble — Method
Reads the preamble of an object.
The cursor will be put into the right place depending on the data.
UnROOT.RNTuple — Type
RNTuple
This is the struct holding all the metadata (schema) needed to completely describe an RNTuple from ROOT. Just like with TTree, to obtain a table-like data object you need to use LazyTree explicitly:
Example
julia> f = ROOTFile("./test/samples/RNTuple/test_ntuple_stl_containers.root");
julia> f["ntuple"]
UnROOT.RNTuple:
header:
name: "ntuple"
ntuple_description: ""
writer_identifier: "ROOT v6.29/01"
schema:
RNTupleSchema with 13 top fields
├─ :lorentz_vector ⇒ Struct
├─ :vector_tuple_int32_string ⇒ Vector
├─ :string ⇒ String
├─ :vector_string ⇒ Vector
...
..
.
julia> LazyTree(f, "ntuple")
Row │ string vector_int32 array_float vector_vector_i vector_string vector_vector_s variant_int32_s vector_variant_ ⋯
│ String Vector{Int32} StaticArraysCor Vector{Vector{I Vector{String} Vector{Vector{S Union{Int32, St Vector{Union{In ⋯
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ one [1] [1.0, 1.0, 1.0] Vector{Int32}[Int3 ["one"] [["one"]] 1 Union{Int64, Strin ⋯
2 │ two [1, 2] [2.0, 2.0, 2.0] Vector{Int32}[Int3 ["one", "two"] [["one"], ["two"]] two Union{Int64, Strin ⋯
3 │ three [1, 2, 3] [3.0, 3.0, 3.0] Vector{Int32}[Int3 ["one", "two", "th [["one"], ["two"], three Union{Int64, Strin ⋯
4 │ four [1, 2, 3, 4] [4.0, 4.0, 4.0] Vector{Int32}[Int3 ["one", "two", "th [["one"], ["two"], 4 Union{Int64, Strin ⋯
5 │ five [1, 2, 3, 4, 5] [5.0, 5.0, 5.0] Vector{Int32}[Int3 ["one", "two", "th [["one"], ["two"], 5 Union{Int64, Strin ⋯
5 columns omitted
UnROOT.RNTupleCardinality — Type
struct RNTupleCardinality{T}
    content_col_idx::Int
    nbits::Int
end
Special field. The cardinality is basically a counter, but the data column is a leaf column of Index32 or Index64. To get a number from Cardinality, one needs to compute ary[i] - ary[i-1].
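The difference rule above can be sketched in plain Julia. The helper name cardinalities is hypothetical (it is not an UnROOT function), and the sketch assumes that the value preceding the first element is taken to be 0.

```julia
# Hypothetical helper: turn a cumulative index column into per-entry counts,
# i.e. ary[i] - ary[i-1], assuming an implicit 0 before the first element.
function cardinalities(ary::AbstractVector{<:Integer})
    counts = similar(ary)
    prev = zero(eltype(ary))
    for i in eachindex(ary)
        counts[i] = ary[i] - prev
        prev = ary[i]
    end
    return counts
end

offsets = Int32[1, 3, 3, 6]        # cumulative end offsets of four entries
counts = cardinalities(offsets)    # per-entry element counts: 1, 2, 0, 3
```

A repeated offset (the third entry here) corresponds to an empty collection, which is why the count is derived from the difference rather than stored directly.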
UnROOT.RNTupleField — Type
mutable struct RNTupleField{R, F, O, E} <: AbstractVector{E}
Not a counterpart of the RNTuple field in ROOT. This is a user-facing, Julia-only construct like LazyBranch that is meant to act like a lazy AbstractVector backed by a file IO source and a schema field from RNTuple.schema.
- R is the type of the parent RNTuple
- F is the type of the field in the schema
- O is the type of output when you read a cluster-worth of data
- E is the element type of O (i.e. what you get for each event (row) in iteration)
UnROOT.RNTupleSchema — Type
struct RNTupleSchema
A wrapper struct for the print_tree implementation of the schema display.
Example
julia> f = ROOTFile("./test/samples/RNTuple/test_ntuple_stl_containers.root");
julia> f["ntuple"].schema
RNTupleSchema with 13 top fields
├─ :lorentz_vector ⇒ Struct
│ ├─ :pt ⇒ Leaf{Float32}(col=26)
│ ├─ :eta ⇒ Leaf{Float32}(col=27)
│ ├─ :phi ⇒ Leaf{Float32}(col=28)
│ └─ :mass ⇒ Leaf{Float32}(col=29)
├─ :vector_tuple_int32_string ⇒ Vector
│ ├─ :offset ⇒ Leaf{Int32}(col=9)
│ └─ :content ⇒ Struct
│ ├─ :_1 ⇒ String
│ │ ├─ :offset ⇒ Leaf{Int32}(col=37)
│ │ └─ :content ⇒ Leaf{Char}(col=38)
│ └─ :_0 ⇒ Leaf{Int32}(col=36)
├─ :string ⇒ String
│ ├─ :offset ⇒ Leaf{Int32}(col=1)
│ └─ :content ⇒ Leaf{Char}(col=2)
├─ :vector_string ⇒ Vector
│ ├─ :offset ⇒ Leaf{Int32}(col=5)
│ └─ :content ⇒ String
│ ├─ :offset ⇒ Leaf{Int32}(col=13)
│ └─ :content ⇒ Leaf{Char}(col=14)
...
..
.
UnROOT.ROOTFile — Method
ROOTFile(filename::AbstractString; customstructs = Dict("TLorentzVector" => LorentzVector{Float64}))
ROOTFile's constructor from a file. The customstructs dictionary can be used to pass a user-defined struct as the value and its corresponding fClassName (in Branch) as the key, so that UnROOT knows how to interpret them; see interped_data.
See also: LazyTree, LazyBranch
Example
julia> f = ROOTFile("test/samples/NanoAODv5_sample.root")
ROOTFile with 2 entries and 21 streamers.
test/samples/NanoAODv5_sample.root
└─ Events
├─ "run"
├─ "luminosityBlock"
├─ "event"
├─ "HTXS_Higgs_pt"
├─ "HTXS_Higgs_y"
└─ "⋮"
UnROOT.StdArrayField — Type
StdArrayField<N, T>
Special base-case field for a leaf field representing std::array<T, N>. This is because RNTuple serializes it as a leaf field but with flags == 0x0001 in the field description. In total, there are two field descriptions associated with array<>: one for the metadata (the N), the other for the actual data.
UnROOT.Streamers — Method
function Streamers(io)
Reads all the streamers from the ROOT source.
UnROOT.StringField — Type
StringField
Special base-case field for a String leaf field. This is because RNTuple splits a leaf String field into two columns (instead of splitting it in field records). So we need an offset column and a content column (that contains Chars).
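To make the two-column layout concrete, here is a minimal sketch that rebuilds strings from an offset column and a content column. The helper strings_from_columns is hypothetical, and the sketch assumes the offsets are cumulative end positions (1-based after conversion) and that the content column is available as raw bytes.

```julia
# Hypothetical helper: rebuild strings from a cumulative end-offset column
# and a flat content column holding the characters of all strings back to back.
function strings_from_columns(offsets::AbstractVector{<:Integer},
                              content::AbstractVector{UInt8})
    out = Vector{String}(undef, length(offsets))
    start = 1
    for (i, stop) in enumerate(offsets)
        out[i] = String(content[start:stop])   # empty range gives ""
        start = stop + 1
    end
    return out
end

content = Vector{UInt8}("onetwothree")   # content column
offsets = Int32[3, 6, 6, 11]             # offset column; the third string is empty
strs = strings_from_columns(offsets, content)
```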
UnROOT.TH — Method
TH(io, tkey::TKey, refs)
Internal function used to form a fields = Dict{Symbol, Any}() that represents the fields of a TH (histogram) in C++ ROOT.
UnROOT._field_output_type — Method
_field_output_type(::Type{F}) where F
This function is used in two ways:
- to provide an output type prediction for each "field" in the RNTuple so we can achieve type stability
- to enforce type stability in read_field:
# this is basically a type assertion for `res`
return res::_field_output_type(field)
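The assertion pattern can be illustrated outside of UnROOT. In this sketch every name (IntField, FloatField, output_type, STORE, read_sketch) is hypothetical; the untyped dictionary merely stands in for data whose type the compiler cannot infer on its own.

```julia
# Hypothetical field types, for illustration only
struct IntField end
struct FloatField end

# Map each field type to the output type promised for it
output_type(::Type{IntField}) = Vector{Int32}
output_type(::Type{FloatField}) = Vector{Float32}

# A loosely typed store: values looked up here are inferred as Any
const STORE = Dict{DataType, Any}(
    IntField => Int32[1, 2, 3],
    FloatField => Float32[0.5, 1.5],
)

function read_sketch(field::F) where {F}
    res = STORE[F]                  # inferred as Any
    return res::output_type(F)      # type assertion pins down the return type
end

ints = read_sketch(IntField())
floats = read_sketch(FloatField())
```

If the stored value ever had a different type than promised, the assertion would throw a TypeError instead of silently propagating an unexpected type to callers.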
UnROOT._rntuple_clusterrange — Method
The event number range a given cluster covers, in Julia's 1-based indexing.
UnROOT.array — Method
array(f::ROOTFile, path; raw=false)
Reads an array from a branch. Set raw=true to return raw data and correct offsets.
UnROOT.arrays — Method
arrays(f::ROOTFile, treename)
Reads all branches from a tree.
UnROOT.auto_T_JaggT — Method
auto_T_JaggT(f::ROOTFile, branch; customstructs::Dict{String, Type})
Given a file and a branch, automatically return (eltype, Jaggtype). This function is aware of the custom structs that are carried with the parent ROOTFile.
This is also where you may want to "redirect" classname -> Julia struct name, for example "TLorentzVector" => LorentzVector, so that you can focus on LorentzVectors.LorentzVector methods from here on.
See also: ROOTFile, interped_data
UnROOT.basketarray — Method
basketarray(f::ROOTFile, path::AbstractString, ith)
basketarray(f::ROOTFile, branch::Union{TBranch, TBranchElement}, ith)
basketarray(lb::LazyBranch, ith)
Reads actual data from the ith basket of a branch. This function first calls readbasket to obtain the raw bytes and offsets of a basket, then calls auto_T_JaggT followed by interped_data to translate the raw bytes into actual data.
UnROOT.basketarray_iter — Method
basketarray_iter(f::ROOTFile, branch::Union{TBranch, TBranchElement})
basketarray_iter(lb::LazyBranch)
Returns a Base.Generator yielding the output of basketarray() for all baskets.
UnROOT.chaintrees — Method
chaintrees(ts)
Chain a collection of LazyTrees together to form a larger tree. Every tree should have identical branch names and types; we're not trying to re-implement SQL here.
Example
julia> typeof(tree)
LazyTree with 1 branches:
a
julia> tree2 = UnROOT.chaintrees([tree,tree]);
julia> eltype(tree.a) == eltype(tree2.a)
true
julia> length(tree)
100
julia> length(tree2)
200
julia> eltype(tree)
UnROOT.LazyEvent{NamedTuple{(:a,), Tuple{LazyBranch{Int32, UnROOT.Nojagg, Vector{Int32}}}}}
julia> eltype(tree2)
UnROOT.LazyEvent{NamedTuple{(:a,), Tuple{SentinelArrays.ChainedVector{Int32, LazyBranch{Int32, UnROOT.Nojagg, Vector{Int32}}}}}}
UnROOT.compressed_datastream — Method
compressed_datastream(io, tkey)
Extract all [compressionheader][rawbytes] from a TKey. This is an isolated function because we want to compartmentalize disk I/O as much as possible.
See also: decompress_datastreambytes
UnROOT.decompress_datastreambytes — Method
decompress_datastreambytes(compbytes, tkey)
Process the compressed bytes compbytes, which were read out by compressed_datastream and are pointed to from tkey. This function simply returns the uncompressed bytes according to the compression algorithm detected (or the lack thereof).
UnROOT.endcheck — Method
function endcheck(io, preamble::Preamble)
Checks if everything went well after parsing a TObject. Used in conjunction with Preamble.
UnROOT.interped_data — Method
interped_data(rawdata, rawoffsets, ::Type{T}, ::Type{J}) where {T, J<:JaggType}
The function that interprets raw bytes (from a basket) into the corresponding Julia data, based on type T and jagg type J.
In order to retrieve data from custom branches, the user should define a more specialized method of this function with specific T and J. See the TLorentzVector example.
UnROOT.interped_data — Method
interped_data(rawdata, rawoffsets, ::Type{Vector{LorentzVector{Float64}}}, ::Type{Offsetjagg})
The interped_data method specialized for LorentzVector. This method will get called by basketarray instead of the default method for a TLorentzVector branch.
UnROOT.isvoid — Method
isvoid(::Type{T})
Internal function to determine (by only looking at the type) whether an RNTuple field is recursively empty. A field is empty if there is no more data column attached to it from this point forward.
For example, the :_0 field is empty here:
├─ Symbol("AntiKt4TruthDressedWZJetsAux:") ⇒ Struct
│ ├─ :m ⇒ Vector
│ │ ├─ :offset ⇒ Leaf{UnROOT.Index64}(col=23)
│ │ └─ :content ⇒ Leaf{Float32}(col=24)
│ ├─ Symbol(":_0") ⇒ Struct
│ │ ├─ Symbol(":_2") ⇒ Struct
│ │ ├─ Symbol(":_1") ⇒ Struct
│ │ ├─ Symbol(":_0") ⇒ Struct
│ │ │ └─ Symbol(":_0") ⇒ Struct
│ │ └─ Symbol(":_3") ⇒ Struct
When we parse the schema, we discard anything that cannot possibly produce readable data.
UnROOT.parseTH — Method
parseTH(th::Dict{Symbol, Any}; raw=true) -> (counts, edges, sumw2, nentries)
parseTH(th::Dict{Symbol, Any}; raw=false) -> Union{FHist.Hist1D, FHist.Hist2D}
When raw=true, parse the output of TH into a tuple of counts, edges, sumw2, and nentries. When raw=false, parse the output of TH into FHist.jl histograms.
Example
julia> UnROOT.parseTH(UnROOT.samplefile("histograms1d2d.root")["myTH1D"])
([40.0, 2.0], (-2.0:2.0:2.0,), [800.0, 2.0], 4.0)
julia> UnROOT.parseTH(UnROOT.samplefile("histograms1d2d.root")["myTH1D"]; raw=false)
edges: -2.0:2.0:2.0
bin counts: [40.0, 2.0]
total count: 42.0
Note: TH1 and TH2 inputs are supported.
UnROOT.parsetobject — Method
Direct parsing of streamed objects which are not sitting on branches. This function needs to be rewritten so that it can create proper types of TObject-inherited data (like TVectorT<*>).
UnROOT.read_field — Method
read_field(io, field::F, page_list) where F
Read a field from the io stream. The page_list is a list of PageLinks for the current cluster group. Type stability is achieved by type-asserting based on type F via the _field_output_type function.
UnROOT.read_field — Method
read_field(io, field::StructField{N, T}, page_list) where {N, T}
Since each field of the struct is stored in a separate field of the RNTuple, this function returns a StructArray to maximize efficiency.
UnROOT.read_pagedesc — Method
read_pagedesc(io, pagedescs::AbstractVector{PageDescription}, cr::ColumnRecord)
Read the decompressed raw bytes given a Page Description. The nbits needs to be provided according to the element type of the column, since pagedesc only contains num_elements information.
We handle split, zigzag, and delta encodings inside this function.
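For readers unfamiliar with these encodings, the following is a stand-alone sketch of how each one can be undone. It is not the code inside read_pagedesc; the function names are hypothetical, and it assumes the usual definitions: zigzag interleaves negative and positive integers, delta stores the first value followed by successive differences, and split stores the first byte of every element, then the second byte of every element, and so on, in little-endian order.

```julia
# Zigzag: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, 4 -> 2, ...
zigzag_decode(u::UInt32) =
    reinterpret(Int32, u >> 1) ⊻ (-reinterpret(Int32, u & 0x00000001))

# Delta: first value is stored as is, the rest are differences to the predecessor
delta_decode(deltas::AbstractVector{<:Integer}) = cumsum(deltas)

# Split: undo the byte-plane layout and reinterpret the bytes as elements of T
function unsplit(bytes::Vector{UInt8}, ::Type{T}) where {T}
    n = length(bytes) ÷ sizeof(T)
    # n x sizeof(T) matrix: row i holds the bytes of element i;
    # transposing and flattening puts each element's bytes next to each other
    interleaved = vec(permutedims(reshape(bytes, n, sizeof(T))))
    return ltoh.(reinterpret(T, interleaved))   # bytes are assumed little-endian
end

decoded_zigzag = zigzag_decode.(UInt32[0, 1, 2, 3, 4])
decoded_delta = delta_decode([10, 1, 2, -3])
decoded_split = unsplit(UInt8[0x02, 0x04, 0x01, 0x03], UInt16)
```

Split encoding groups bytes of equal significance together, which tends to help the general-purpose compressor that runs afterwards; zigzag and delta keep the stored magnitudes small for the same reason.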
UnROOT.readbasket — Method
readbasket(f::ROOTFile, branch, ith)
readbasketseek(f::ROOTFile, branch::Union{TBranch, TBranchElement}, seek_pos::Int, nbytes)
The fundamental building block of reading data from a .root file. Reads one basket's raw bytes and offsets at a time. These raw bytes and offsets then (potentially) get processed by interped_data.
See also: auto_T_JaggT, basketarray
UnROOT.readobjany! — Method
function readobjany!(io, tkey::TKey, refs)
The main entry point where streamers are parsed and cached for later use. The refs dictionary holds the streamers or parsed data, which are reused when already available.
UnROOT.rnt_ary_to_page — Method
rnt_ary_to_page(ary::AbstractVector, cr::ColumnRecord)
Turns an AbstractVector into a page of an RNTuple. The element type must be primitive for this to work.
UnROOT.rnt_col_to_ary — Method
rnt_col_to_ary(col) -> Vector{Vector}
Normalize each user-facing "column" into a collection of Vector{<:Real} ready to be written to a page. After calling this on all user-facing "columns", we should have as many arys as ColumnRecords, in the same order.
UnROOT.rnt_write — Method
@SimpleStruct struct ClusterGroupRecord
    minimumentrynumber::Int64
    entryspan::Int64
    numclusters::Int32
    pagelistlink::EnvLink
end
UnROOT.skiptobj — Method
function skiptobj(io)
Skips a TObject.
UnROOT.splitup — Method
splitup(data::Vector{UInt8}, offsets, T::Type; skipbytes=0)
Given the offsets and data returned by array(...; raw = true), reconstruct the actual array (with custom structs; can be jagged as well).
UnROOT.topological_sort — Method
function topological_sort(streamer_infos)
Sort the streamers with respect to their dependencies and keep only those which are not defined already.
The implementation is based on https://stackoverflow.com/a/11564769/1623645
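As an illustration of the idea, here is a minimal depth-first topological sort over a plain dependency dictionary. It is a sketch rather than UnROOT's implementation: toposort_sketch, the deps layout (name => names it depends on), and the known set of already-defined names are all hypothetical, and cycles are not detected.

```julia
# Hypothetical sketch: order names so that dependencies come first,
# skipping anything listed in `known` (already defined elsewhere).
function toposort_sketch(deps::Dict{String, Vector{String}},
                         known::Set{String} = Set{String}())
    sorted = String[]
    visited = Set{String}()
    function visit(name)
        (name in visited || name in known) && return
        push!(visited, name)
        for d in get(deps, name, String[])   # visit dependencies first
            visit(d)
        end
        push!(sorted, name)                  # then emit the name itself
        return
    end
    for name in sort!(collect(keys(deps)))   # sorted keys give a reproducible order
        visit(name)
    end
    return sorted
end

deps = Dict(
    "TTree" => ["TNamed", "TAttLine"],
    "TNamed" => ["TObject"],
    "TAttLine" => String[],
    "TObject" => String[],
)
order = toposort_sketch(deps, Set(["TObject"]))
```

In the resulting order, TNamed and TAttLine both precede TTree, and TObject is omitted because it was declared as already known.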
UnROOT.unpack — Method
unpack(x::CompressionHeader)
Return the following information:
- Name of compression algorithm
- Level of the compression
- compressedbytes and uncompressedbytes according to uproot3
UnROOT.@SimpleStruct — Macro
macro SimpleStruct
Define a reading method on the fly for _rntuple_read.
Example
julia> @SimpleStruct struct Locator
           num_bytes::Int32
           offset::UInt64
       end
would automatically define the following reading method:
function _rntuple_read(io, ::Type{Locator})
    num_bytes = _rntuple_read(io, Int32)
    offset = _rntuple_read(io, UInt64)
    Locator(num_bytes, offset)
end
Notice that _rntuple_read falls back to read for all types that are not defined by us.
UnROOT.@stack — Macro
macro stack(into, structs...)
Stack the fields of multiple structs and create a new one. The first argument is the name of the new struct, followed by the ones to be stacked. Parametric types are not supported and the field names need to be unique.
Example:
@stack Baz Foo Bar
Creates Baz with the concatenated fields of Foo and Bar.