-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add Filestore backend #25
Conversation
I'll attempt to describe my current sticking point so I can come back to it when I have more time. Our schema has everything inheriting from
I think this behavior all makes sense, given the ambiguity of these classes. I see two main options:
I tried looking around for apischema-voodoo that would let us use a tagged_union of a tagged_union, but didn't anything super simple. |
This has gone on long enough. I'll request reviews and give it another once-over tomorrow with fresher eyes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review, I'm out on Friday but I can continue next week if needed
path = os.path.expanduser(path) | ||
if not os.path.isabs(path): | ||
return os.path.abspath(os.path.join(basedir, path)) | ||
return path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This utils function is a bit weird to me. In a lot of cases it's not necessary to provide a basedir to figure out a full absolute path, and in many cases this input arg is ignored entirely. Maybe it has specific utility in suggesting directories to save in?
|
||
|
||
@pytest.fixture(scope='function') | ||
def sample_database() -> Root: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this at odds with the fixture defined in
def linac_backend(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they can coexist. Having multiple test backends can't hurt right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they both exist as in-memory databases for testing and if they disagree on how to set up such a database, it's possible that we can get confused later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fair. I would can get behind unifying how we set them up, rather than avoiding having multiple test backends
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support having multiple as long as the interface is consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
captured some thoughts in: #33
… object references
…led and lazy respectively
….swap_to_uuids signature
0ad071c
to
16014c6
Compare
An example serialized file if we remove Details
{
"meta_id": "a28cd77d-cc92-46cc-90cb-758f0f36f041",
"entries": [
{
"uuid": "dfce8fb3-0a13-4fb3-b34e-aa7752decf26",
"description": "parameter 1 in root",
"creation_time": "2024-05-16T23:52:28.879185+00:00",
"pv_name": "MY:MOTOR:mtr1.ACCL",
"abs_tolerance": null,
"rel_tolerance": null,
"readback": null,
"read_only": false
},
{
"uuid": "0c566d5a-5ec8-4e7c-9d65-625a0d44185e",
"description": "",
"creation_time": "2024-05-16T23:52:28.879216+00:00",
"pv_name": "MY:MOTOR:mtr1.ACCL",
"data": 2,
"status": 18,
"severity": 4,
"readback": null
},
{
"uuid": "24fc4d5d-545f-4b54-9a7d-233866f0cb0d",
"description": "collection 1 defining some motor fields",
"creation_time": "2024-05-16T23:52:28.879236+00:00",
"title": "collection 1",
"children": [
{
"uuid": "067e4eb3-b745-49b9-97ba-669c17b7a1a3",
"description": "motor field ACCL",
"creation_time": "2024-05-16T23:52:28.879277+00:00",
"pv_name": "MY:PREFIX:mtr1.ACCL",
"abs_tolerance": null,
"rel_tolerance": null,
"readback": null,
"read_only": false
},
{
"uuid": "c1ea420c-735f-4299-811c-4726863df69c",
"description": "motor field VELO",
"creation_time": "2024-05-16T23:52:28.879314+00:00",
"pv_name": "MY:PREFIX:mtr1.VELO",
"abs_tolerance": null,
"rel_tolerance": null,
"readback": null,
"read_only": false
},
{
"uuid": "68581b2c-a69e-4695-94ef-2e6de2c81605",
"description": "motor field PREC",
"creation_time": "2024-05-16T23:52:28.879350+00:00",
"pv_name": "MY:PREFIX:mtr1.PREC",
"abs_tolerance": null,
"rel_tolerance": null,
"readback": null,
"read_only": false
}
],
"tags": []
},
{
"uuid": "32e5d77c-350e-4842-807a-5e281f1a8023",
"description": "Snapshot 1 created from collection 1",
"creation_time": "2024-05-16T23:52:28.879255+00:00",
"title": "snapshot 1",
"origin_collection": null,
"children": [
{
"uuid": "f71c5363-a7e0-479e-b748-f034eb9f75f4",
"description": "",
"creation_time": "2024-05-16T23:52:28.879295+00:00",
"pv_name": "MY:PREFIX:mtr1.ACCL",
"data": 2,
"status": 18,
"severity": 4,
"readback": null
},
{
"uuid": "74405890-5fbb-4c44-916a-1ffc728d8e70",
"description": "",
"creation_time": "2024-05-16T23:52:28.879331+00:00",
"pv_name": "MY:PREFIX:mtr1.VELO",
"data": 2,
"status": 18,
"severity": 4,
"readback": null
},
{
"uuid": "0fec8587-6cc7-41e8-823f-11b43f3ed3d9",
"description": "",
"creation_time": "2024-05-16T23:52:28.879367+00:00",
"pv_name": "MY:PREFIX:mtr1.PREC",
"data": 6,
"status": 18,
"severity": 4,
"readback": null
}
],
"tags": [],
"meta_pvs": []
}
]
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more thoughts from EOD reviewing
data: Optional[AnyEpicsType] = None | ||
status: Status = Status.UDF | ||
severity: Severity = Severity.INVALID | ||
|
||
|
||
@dataclass | ||
class Setpoint(Value): | ||
class Setpoint(Entry): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is Value
used for after this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me there are Value
s that are not Setpoint
-Readback
pairs and I wanted to leave that option open. If people are ok with having readback=None
for that type of Value
, we can remove Value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer having Setpoint
and Readback
inherit from Value
, but I know that that's causing issues with your serialization. If we can't resolve that issue, then I'm fine with removing Value
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove Value then. I think it's unfortunate that I can't get the tagged-unions-of-tagged-unions to work, but from a usability standpoint I don't think there's much of a difference. It's not the most DRY code in the world but I'd argue there are more offensive things out there...
|
||
|
||
@pytest.fixture(scope='function') | ||
def sample_database() -> Root: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they both exist as in-memory databases for testing and if they disagree on how to set up such a database, it's possible that we can get confused later.
from superscore.type_hints import AnyEpicsType | ||
from superscore.utils import utcnow | ||
|
||
logger = logging.getLogger(__name__) | ||
_root_uuid = _root_uuid = UUID("a28cd77d-cc92-46cc-90cb-758f0f36f041") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the idea behind this root uuid? It looks like it is static and is placed into every file exported by the filestore backend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is static. The concept behind Root
is to have a single entry in the database with this UUID. This entry holds the top level Entry
s, for constructing a tree-view for example.
Otherwise there would be no way to determine which Entry
s are at the root level, and distinguish them from child Entry
s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it stuck out as me as weird since I assumed the root uuid was a database unique identifier or something since it gets included in the serialized data file. I guess it's not a problem if every filestore database has an extra a28cd in it.
os.remove(temp_path) | ||
raise | ||
|
||
def _temp_path(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend looking at tempfile
in the standard lib, which provides similar functionality in a context manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of rolling our own here is that in the case of an interruption, the temp file will continue to exist and can be recovered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with tempfile.NamedTemporaryFile
, which seemed promising at first. I wasn't able to easily incorporate it into the backend, and there is some OS-dependent behavior I'd like to avoid.
This file auto-deletes itself after exiting the context manager, which seems convenient. Except for the fact that we attempt to move the temp file to the existing file location. After doing this the tmp file no longer exists and the context manager throws. Copy transactions are not guaranteed to be atomic, so I actually think we might have to roll our own here.
|
||
|
||
@pytest.fixture(scope='function') | ||
def sample_database() -> Root: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support having multiple as long as the interface is consistent.
data: Optional[AnyEpicsType] = None | ||
status: Status = Status.UDF | ||
severity: Severity = Severity.INVALID | ||
|
||
|
||
@dataclass | ||
class Setpoint(Value): | ||
class Setpoint(Entry): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer having Setpoint
and Readback
inherit from Value
, but I know that that's causing issues with your serialization. If we can't resolve that issue, then I'm fine with removing Value
.
…oad_and_store_context manager, fix some typos
Description
Still some work to be done, but I think I should open this up to review before I go too off the rails.
UUID
to potentially lazy fields.Motivation and Context
Much of this is inspired from lessons learned from happi, forgive any vestigial bits that have made their way in.
What's going on here?
Internally, the FilestoreBackend holds one data structure in addition to the
Root
:The entry cache is used to quickly grab an
Entry
or search without recursing through the tree (avoiding double hits). With each CRUD operation, the backend uses this cache to reconstruct theRoot
object, filling in uuid references where they exist.In the future, I expect us to keep a sort of uuid-link-map, mapping a uuid to other uuids that reference it.
When other backends delete an entry, we will need this information to make sure we dereference the other entries properly. However since the FilestoreBackend reconstructs the entire root with every modification, it's not strictly needed
Why did you add a
Root
object?In any backend context, we need some way to know what
Entry
s are at the top level of the tree structure. While in the Filestore backend case we might store theEntry
s in a "filled" state (with full objects as children recursively), if we ever want to flatten this structure we will need to keep track of the top-level items.A simple "Root" object (holding only a list of
Entry
s) seemed like a simple way to do this.Why did
@as_tagged_union
come back?It may not be totally needed (we could specify each of the
Entry
subclasses inRoot
), but I think it helps the readability of the serialized output. Also there are issues with Value -> Setpoint/Readback, the nested tagged-union has been causing issues with serialization round tripsUUIDs are back
Admittedly we could probably run through the entire filestore database every time we searched for an Entry, but I used this as a potential example of how we might implement entry caching, lazy loading, and tree reconstruction. The rough idea for the filestore backend side is to regenerate and save the json db with every action.
How Has This Been Tested?
Added some unit tests, more to come.
Where Has This Been Documented?
This PR
Pre-merge checklist
docs/pre-release-notes.sh
and created a pre-release documentation page