Lazy snapshot accessor
this is untested, you can help by testing it
maybe iteration would also be nice, but then it'd be better to keep the file handle open? idk
Merge request reports
Activity
- Resolved by Lukas Hupe
the reason why I'm worried about iteration is that if we don't implement it, people will do
[Snapshots(df)[i] for i in 0:lastindex(Snapshots(df))]
or something.Edited by Lukas Hupe
- Resolved by Lukas Hupe
Looks good! I am wondering if it's good to break Julia convention by using 0-based indexing though. For the old ClusterTools snapshot indexing, I had shifted everything by one (basically,
simstate(..., snapidx)
was the original, whereasgetindex(..., idx) = simstate(..., idx-1)
was defined the Julian way). I am not sure what's best though.Iteration sounds like a good idea, and
length
probably, too.
- Resolved by Lukas Hupe
Are result files allowed not to contain any snapshots? If so, then
numsnaps = 0
will lead to_lastsnap = -1
, the same as the nothing-cached-yet magic value. So perhaps use-2
ortypemin(Int64)
for that?
- Resolved by Lukas Hupe
this should also be in v0.4.1
- Resolved by Lukas Hupe
67 67 end 68 68 end 69 69 70 ## Lazy snapshot accessor 71 72 struct Snapshots 73 path::String 74 _lastsnap::Ref{Int} 75 # use -2 as magic value to distinguish empty files (numsnaps = 0) 76 Snapshots(path::String; lastsnap = -2) = new(path, Ref(lastsnap)) 77 end 78 79 # indexed access 80 Base.getindex(sn::Snapshots, i::Int) = InPartS.dfopen(sn.path, "r") do df 81 InPartS.readsim(df, snap = i) 82 end - Comment on lines +80 to +82
103 Base.getindex(sn::Snapshots, i::Int) = InPartS.dfopen(sn.path, "r") do df 104 InPartS.readsim(df, snap = i) 105 end 103 function Base.getindex(sn::Snapshots, i::Int) 104 if !(firstindex(sn) ≤ i ≤ lastindex(sn)) 105 throw(BoundsError("attempt to access snapshot at index [$i], valid snapshot indices are $(firstindex(sn)):$(lastindex(sn))")) 106 end 107 readsim(sn.path, snap = i) 108 end how does this interact with
@inbounds
? Should this somehow be tagged or can we harness an automatic check against firstindex/lastindex that would be disabled correctly?But given that this would open the file and set
_lastsnap
anyway, I am wondering if we should just do that on construction (it would also test automatically whether the file exists).again, opening the file on construction is against the design idea that I had in mind, where the file is only opened on accessing the struct
we could in theory add the
@boundscheck
, but why should we? it's not like the check is at all relevant for perfomance (like Jonas said while I was typing this)yes, it calls numsnaps earlier, you are right. but when you're loading a huge snapshot with 10^6 cells in it immediately afterwards, most of the time will be spent doing that
and yes, with that logic there's no real need to store it in the struct (other than the potential expansion to slicing support mentioned below), but it also doesn't cost us anything, so I did it anyway
ok, I thought the price we pay by making the upper bound somewhat non-lazy (think of snapshots getting added after the file is accessed the first time and numsnaps is cached) but in a side-effect kind of way was a weird intermediate solution, but in the end I don't care, just wanted to mention it.
- Resolved by Lukas Hupe
one more comment: if we ever decide to support slicing, it would have to return another
Snapshots
struct (which would have to be modified to support a non-zero firstindex) so that a syntax like[analysis(snap) for snap in Snapshots(file)[34:end-6]]
doesn't greedily load all the snapshots (e.g. in case the snapshots are huge and memory is limited)
added 2 commits
- Resolved by Philip Bittihn
Just a quick question in terms of code organization: Do we have other examples of globals in individual
.jl
files or are all others inconstants.jl
? ShouldIOWARN
be, too?
mentioned in commit f13a4b62