46 adding catalogs section in the pyaml configuration file #192

Draft
gupichon wants to merge 29 commits into main from 46-adding-catalogs-section-in-the-pyaml-configuration-file

@gupichon
Contributor

Description

All references to the control system are moved to a catalog section.

Related Issue

Features/issues described there are:

  • new feature: was implemented in the following way... because...
  • bugfix: was implemented in the following way... because...
  • ...

Changes to existing functionality

Describe the changes that had to be made to an existing functionality (if they were made)

  • First change: reimplemented in the following way... because
  • Second change: reimplemented in the following way... because
  • ...

Testing

The following tests (compatible with pytest) were added:

  • first test
  • second test
  • ...

Verify that your checklist complies with the project

  • New and existing unit tests pass locally
  • Tests were added to prove that all features/changes are effective
  • The code is commented where appropriate
  • Any existing features are not broken (unless there is an explicit change to an existing functionality)

@gupichon gupichon self-assigned this Feb 12, 2026
@gupichon gupichon linked an issue Feb 12, 2026 that may be closed by this pull request
@JeanLucPons
Contributor

JeanLucPons commented Feb 12, 2026

As previously discussed, it would be nice, in order to minimize the impact and allow the BPM model to evolve, if you restored the DeviceAccess ref in the model and changed:

        if catalog.has_reference(name + "/tilt"):
            return catalog.get_one(name + "/tilt")

to

        if catalog.has_reference(name + "/" + self._cfg.tilt):
            return catalog.get_one(name + "/" + self._cfg.tilt)

and the same for positions and offsets.
Otherwise, re-updating the unit tests and examples will be a pain.

I also expect positions and offsets that can be configured individually, so catalog.get_one() should be used everywhere.
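
A minimal sketch of the lookup suggested above, assuming a hypothetical in-memory Catalog and a hypothetical TiltConfig. Only has_reference(), get_one() and self._cfg.tilt come from the snippet; every other name is an illustration and not the actual pyaml API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class Catalog:
    """Hypothetical in-memory catalog mapping reference strings to devices."""

    def __init__(self, entries: Dict[str, object]):
        self._entries = dict(entries)

    def has_reference(self, ref: str) -> bool:
        return ref in self._entries

    def get_one(self, ref: str) -> object:
        return self._entries[ref]


@dataclass
class TiltConfig:
    # Catalog sub-key for the tilt: configurable instead of hard-coded.
    tilt: str = "tilt"


class TiltModel:
    """Hypothetical model that keeps its reference names in its config."""

    def __init__(self, cfg: TiltConfig):
        self._cfg = cfg

    def get_tilt_device(self, name: str, catalog: Catalog) -> Optional[object]:
        ref = name + "/" + self._cfg.tilt
        if catalog.has_reference(ref):
            return catalog.get_one(ref)
        return None


# The string "tilt-device" stands in for a real DeviceAccess object.
catalog = Catalog({"BPM_C01-01/angle": "tilt-device"})
model = TiltModel(TiltConfig(tilt="angle"))

found = model.get_tilt_device("BPM_C01-01", catalog)
missing = model.get_tilt_device("BPM_C01-02", catalog)
```

With the sub-key taken from the config, renaming "tilt" to "angle" in the catalog only needs a config change, not a code change.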

@gupichon
Contributor Author

As previously discussed, it would be nice, in order to minimize the impact and allow the BPM model to evolve, if you restored the DeviceAccess ref in the model and changed:

        if catalog.has_reference(name + "/tilt"):
            return catalog.get_one(name + "/tilt")

to

        if catalog.has_reference(name + "/" + self._cfg.tilt):
            return catalog.get_one(name + "/" + self._cfg.tilt)

and the same for positions and offsets. Otherwise, re-updating the unit tests and examples will be a pain.

I also expect positions and offsets that can be configured individually, so catalog.get_one() should be used everywhere.

Do you mean having x_offset and y_offset, x_pos and y_pos, and so on?

@gupichon
Contributor Author

I've created the draft pull request; it will be easier to discuss here than under a commit.

@JeanLucPons
Contributor

Yes, but as strings, so we can use the catalog.
The BPM model will evolve, and we will definitely add models that compute the position from the BPM buttons, for instance, using a more accurate model than the classical DoS.
That means that instead of get_pos_devices() in the model, you could have a get_button_devices(), for instance.

@JeanLucPons
Contributor

Thanks.
It would be nice to make them optional. Otherwise I cannot configure a SimpleBPM with only one position.

x_pos: str | None = None
y_pos: str | None = None

@gupichon
Contributor Author

gupichon commented Feb 12, 2026

Thanks. It would be nice to make them optional. Otherwise I cannot configure a SimpleBPM with only one position.

x_pos: str | None = None
y_pos: str | None = None

Since the existence of x_pos and y_pos will be the general case, and a single position the exception, wouldn’t it be better to have a field specifying which axes are available, with both as the default? This would avoid having to specify x_pos and y_pos everywhere, except for a few BPMs. Same for offsets.

available_axes: list[str] = ["x", "y"]
x_pos: str = "x_pos"
y_pos: str = "y_pos"

@JeanLucPons
Contributor

If you configure only x_pos, then you have only x; there is no need for an extra flag.
If you want to optimize access to the catalog with get_many(), you can easily reconstruct this flag internally.
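
A rough sketch of that reconstruction, assuming a plain dataclass as a stand-in for the model configuration (the real configuration classes may well be pydantic models). SimpleBpmConfig and available_axes() are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleBpmConfig:
    # None means the axis is not configured for this BPM.
    x_pos: Optional[str] = None
    y_pos: Optional[str] = None

    def available_axes(self) -> List[str]:
        """Derive the axis list from the fields that are actually set."""
        axes = []
        if self.x_pos is not None:
            axes.append("x")
        if self.y_pos is not None:
            axes.append("y")
        return axes


both = SimpleBpmConfig(x_pos="BPM_C01-01/x", y_pos="BPM_C01-01/y")
only_x = SimpleBpmConfig(x_pos="BPM_C01-02/x")
```

Here the axis list is derived data, so there is no second setting that could contradict x_pos and y_pos.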

@gupichon
Contributor Author

No, I just want to make the file shorter and more readable.

@gupichon
Contributor Author

gupichon commented Feb 12, 2026

It would look like:

devices:
- type: pyaml.bpm.bpm
  name: BPM_C01-01
  model:
    type: pyaml.bpm.bpm_tiltoffset_model
- type: pyaml.bpm.bpm
  name: BPM_C01-02
  model:
    type: pyaml.bpm.bpm_simple_model
    available_axes: [x]

@JeanLucPons
Contributor

JeanLucPons commented Feb 12, 2026

For me this is counterintuitive, because I don't know what to put in the catalog.
You will have to create hidden keywords such as pos/off and tilt.
Last but not least, BPMSimpleModel will be replaced by BPMTiltOffsetModel with optional tilt and offset.

Honestly, I would really prefer to have a simple string in the x_pos attribute that refers directly to the catalog, even independently of the name of the PyAML top-level object.

@gupichon
Contributor Author

As you wish then

@JeanLucPons
Contributor

In the end, I would like to be able to do:

- type: pyaml.bpm.bpm
  name: BPM_C21-09
  model:
    type: pyaml.bpm.bpm_simple_model
    x_pos: srdiag/bpm/c21-09/SA_HPosition
    y_pos: srdiag/bpm/c21-09/SA_VPosition
    x_offset: srdiag/bpm/c21-09/HOffset
    y_offset: srdiag/bpm/c21-09/VOffset

and later:

- type: pyaml.bpm.bpm
  name: BPM_C21-09
  model:
    type: pyaml.bpm.bpm_simple_model
    x_pos: srdiag/bpm/c21-09/SA_HPosition
    y_pos: srdiag/bpm/c21-09/SA_VPosition
    x_offset: srdiag/bpm/c21-09/HOffset
    y_offset: srdiag/bpm/c21-09/VOffset
    incoherency: srdiag/bpm/c21-09/Incoherency

or

- type: pyaml.bpm.bpm
  name: BPM_C21-09
  model:
    type: pyaml.bpm.bpm_button_model
    va: srdiag/bpm/c21-09/SA_VA
    vb: srdiag/bpm/c21-09/SA_VB
    vc: srdiag/bpm/c21-09/SA_VC
    vd: srdiag/bpm/c21-09/SA_DC
    K: [1.001,0.9956,1.019287,1.0021]

Then srdiag/bpm/c21-09/SA_HPosition will be looked up in the catalog for the CS backend.
That would be nice.

@gupichon
Contributor Author

Actually, this can already work. We can develop a specific Tango catalog that builds the DeviceAccess object based on the key used to query it, which would simply be the device name.

@gupichon
Contributor Author

You can even go further and have the control system register itself in the catalog. Such a catalog could then check the Tango database to see whether a device exists, fetch its units, and so on. In that case, you would not even need to list the devices anymore.
However, this would reintroduce the naming convention at the device level again.
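
A minimal sketch of such an on-demand catalog, assuming a hypothetical DynamicCatalog that calls a factory the first time a key is requested. The factory below only splits the key and does not talk to Tango; a real one would query the control system.

```python
from typing import Callable, Dict


class DynamicCatalog:
    """Hypothetical catalog that builds an entry the first time it is requested."""

    def __init__(self, factory: Callable[[str], object]):
        self._factory = factory
        self._cache: Dict[str, object] = {}

    def get_one(self, ref: str) -> object:
        # Build the entry lazily, then reuse it on later lookups.
        if ref not in self._cache:
            self._cache[ref] = self._factory(ref)
        return self._cache[ref]


def fake_tango_factory(ref: str) -> dict:
    # Stand-in for a backend call: split "device/attribute" at the last slash.
    device, _, attribute = ref.rpartition("/")
    return {"device": device, "attribute": attribute}


catalog = DynamicCatalog(fake_tango_factory)
entry = catalog.get_one("srdiag/bpm/c21-09/SA_HPosition")
same_entry = catalog.get_one("srdiag/bpm/c21-09/SA_HPosition")
```

Because the key is parsed to find the device, the key format becomes a naming convention, which is the drawback mentioned above.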

@JeanLucPons
Contributor

You can even go further and have the control system register itself in the catalog. Such a catalog could then check the Tango database to see whether a device exists, fetch its units, and so on. In that case, you would not even need to list the devices anymore. However, this would reintroduce the naming convention at the device level again.

This is not really the goal. You still have in your catalog all the configuration of your underlying device, which can be Epics or Tango. I recall that, for instance, the type is needed with pyaml-cs-oa. For the time being, pyaml-cs-oa assumes that all control system variables are Float64 or Float64 arrays. We don't have this issue with the native Tango backend (this is why attach_array() and attach() have the same implementation in the native Tango backend). Last but not least, for a RW variable in Epics you have two underlying PVs.

For x_pos, I expect an arbitrary string that refers to one RO Tango attribute or Epics PV.

I like the idea of this Catalog. You have just a simple string for your device config at the Element (or Model) level.

@gupichon
Contributor Author

If it's OK with you, I'll continue with the rest of pyaml.

@JeanLucPons
Contributor

Have you already rewritten and tested the examples?
I'll be off tomorrow and Monday.

@JeanLucPons
Contributor

    def get_offset_devices(
        self, name: str, catalog: Catalog
    ) -> list[DeviceAccess | None]:

I don't understand why the name is needed there.

@JeanLucPons
Contributor

I’m currently looking into it.

The ability to change the underlying device of a reference does not introduce any additional complexity in the configuration file compared to the initial catalog proposal. It remains strictly the same.

It adds complexity compared to what we have now, so it does not go in the right direction.
However, I am counting on a dynamic catalog to make DeviceAccess disappear completely from the configuration.
I have some good news (I hope) from ophyd-async concerning types. I have to test.

@gupichon
Contributor Author

@JeanLucPons, the bug causing the Tango host to be tripled has been fixed. We may need to discuss how attach_indexed is supposed to work, but with my latest commit, it should be fine.

Also, can you resolve the last conflict?

@JeanLucPons
Contributor

OK now it works.
attach_indexed() tells the backend that an array (a DevDouble Spectrum, if you prefer) is expected. It is directly linked to #1228. If this issue is solved, then attach_indexed() can be removed.
I'll look at the conflicts tomorrow.

@TeresiaOlsson
Contributor

From my perspective, the configuration file is simply one way (static) of initializing PyAML, among others. The catalog itself can remain dynamic. What is needed is an API to populate and update it, along with, through the use of decorators, the ability to switch references within high-level objects.

The primary goal of the catalog is to abstract away direct backend references from the user. Instead, users interact only with references.

For me that makes the catalog sound very similar to the AO (AcceleratorObject) in MML. I would say the AO is somewhat of an in-memory database of all the devices and their configuration and works well for the users without adding extra complexity. So you populate it once by reading in some static source and then you can dynamically change it if you want but those changes only live during that session unless you save the changes back to the static source at some point.

However, the reason why I think it works well in MML is because the only way to change the configuration of a device is to make the change in the AO. If it's possible in pyAML to change the configuration of a BPM directly on the BPM object or in the catalog I agree with @JeanLucPons that this makes things more confusing rather than simpler.

This is one of the reasons why I think creating all the objects for the elements directly after loading the config might not be the best solution. For me that always felt like it creates a lot of long-lived objects that you need to manage and keep up to date if you implement dynamic changes of the config. I thought the dynamic config would be easier to implement if there was some central object that stores all config data (like the MML AO or the catalog) and then the objects for the elements are created when the user needs them. So sr.live.get_bpm("BPM_C21-09") would not return an object that already exists but rather go to the catalog, take the information that is there at that point and create the object. The user can then do:

bpm1 = sr.live.get_bpm("BPM_C21-09")

# Here the user changes the catalog entry using some API

bpm2 = sr.live.get_bpm("BPM_C21-09")

to get two objects for the same BPM with different configs. I think that is easy to do as long as the API for the catalog is straightforward. Otherwise, how do you keep the live and virtual accelerator modes up to date with each other?

If I do sr.live.get_bpm("BPM_C21-09").x_pos = "srdiag/bpm/c21-09/SA_HPosition" what is changing at the moment? Is it only the BPM for the live mode or does it also change the corresponding entry in the catalog and propagate that to all the other objects that reference the same entry?
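
A small sketch of the "create on demand from a central store" idea described above. ConfigStore, ControlMode and Bpm are hypothetical names; the point is only that get_bpm() reads the store at call time instead of returning a long-lived object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bpm:
    name: str
    x_pos: str


class ConfigStore:
    """Hypothetical central store: the only place where configuration lives."""

    def __init__(self):
        self._entries = {}

    def update(self, name: str, **fields) -> None:
        self._entries.setdefault(name, {}).update(fields)

    def get(self, name: str) -> dict:
        return dict(self._entries[name])


class ControlMode:
    """Hypothetical control mode (live, virtual, ...) sharing one store."""

    def __init__(self, store: ConfigStore):
        self._store = store

    def get_bpm(self, name: str) -> Bpm:
        # Build a fresh object from whatever the store holds right now.
        return Bpm(name=name, **self._store.get(name))


store = ConfigStore()
store.update("BPM_C21-09", x_pos="srdiag/bpm/c21-09/SA_HPosition")
live = ControlMode(store)
virtual = ControlMode(store)

bpm1 = live.get_bpm("BPM_C21-09")
store.update("BPM_C21-09", x_pos="ORBITCC:rdPos")  # change made in the store only
bpm2 = live.get_bpm("BPM_C21-09")
bpm_virtual = virtual.get_bpm("BPM_C21-09")
```

In this sketch bpm1 keeps the configuration it was built with, while objects created after the change (in any mode) see the new entry.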

@JeanLucPons
Contributor

JeanLucPons commented Mar 17, 2026

This is what I would like to simplify, by getting rid of the catalog or AO and having just one string that tells how to access the hardware. Reference factoring problems should never be seen by the user. If you do sr.live.get_bpm("BPM_C21-09").x_pos = "srdiag/bpm/c21-09/SA_HPosition", it changes only what you expect: the Tango attribute for BPM_C21-09. However, all arrays or tools that reference BPM_C21-09 will now access this new Tango attribute.
If you use indexed BPMs, then you should be able to do:

sr.live.get_bpms("BPM").x_pos = "ORBITCC:rdPos"
or
sr.live.get_bpms("BPM").x_pos = ["ORBITCC:rdPos"] * len(sr.live.get_bpms("BPM"))

sr.live.get_bpms("BPM").x_pos_index = [i*2 for i in range(len(sr.live.get_bpms("BPM")))]
sr.live.get_bpms("BPM").y_pos_index = [i*2+1 for i in range(len(sr.live.get_bpms("BPM")))]

@JeanLucPons
Contributor

JeanLucPons commented Mar 18, 2026

To go a bit further, and to avoid confusion, I would even change the property to *_name:

sr.live.get_bpms("BPM").x_pos_name = "ORBITCC:rdPos"
or
sr.live.get_bpms("BPM").x_pos_name = ["ORBITCC:rdPos"] * len(sr.live.get_bpms("BPM"))

sr.live.get_bpms("BPM").x_pos_index = [i*2 for i in range(len(sr.live.get_bpms("BPM")))]
sr.live.get_bpms("BPM").y_pos_index = [i*2+1 for i in range(len(sr.live.get_bpms("BPM")))]
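
To make the index lists concrete, here is a worked example in plain Python. It assumes that the shared array interleaves the readings as [x0, y0, x1, y1, ...], which is what the i*2 and i*2+1 formulas suggest; the number of BPMs and the values are made up.

```python
n_bpms = 4  # assumed number of BPMs, for illustration only

# Assumed layout of the shared array: [x0, y0, x1, y1, ...]
readings = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]

# Even slots hold the horizontal readings, odd slots the vertical ones.
x_pos_index = [i * 2 for i in range(n_bpms)]
y_pos_index = [i * 2 + 1 for i in range(n_bpms)]

x_positions = [readings[i] for i in x_pos_index]
y_positions = [readings[i] for i in y_pos_index]
```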

@TeresiaOlsson
Contributor

This is what I would like to simplify, by getting rid of the catalog or AO and having just one string that tells how to access the hardware. Reference factoring problems should never be seen by the user. If you do sr.live.get_bpm("BPM_C21-09").x_pos = "srdiag/bpm/c21-09/SA_HPosition", it changes only what you expect: the Tango attribute for BPM_C21-09. However, all arrays or tools that reference BPM_C21-09 will now access this new Tango attribute. If you use indexed BPMs, then you should be able to do:

sr.live.get_bpms("BPM").x_pos = "ORBITCC:rdPos"
or
sr.live.get_bpms("BPM").x_pos = ["ORBITCC:rdPos"] * len(sr.live.get_bpms("BPM"))

sr.live.get_bpms("BPM").x_pos_index = [i*2 for i in range(len(sr.live.get_bpms("BPM")))]
sr.live.get_bpms("BPM").y_pos_index = [i*2+1 for i in range(len(sr.live.get_bpms("BPM")))]

For me it looks like that only changes the TANGO attribute for the live mode? Or will it somehow also change the TANGO attribute for all the other control modes the user may have configured that use the same BPM? If it does, that becomes confusing for the users, but if it doesn't, the user needs to do the same thing for all the modes even though they all referred to the same device in the original yaml config.

Personally, I think that syntax is too complicated compared to how it's done in MML, and I doubt the current MML users will like it. They are already complaining about the pyAML user interface being too complicated and MML being easier to use. I think the user interface would be simpler if we don't allow changing the configuration at the element level at all but instead have something like the catalog, which is the only source of information for configuration data and manages changes to it. I would rather extend the catalog to include everything in the config and not just the control system parts. The yellow pages functionality can then just be something that prints out the information that is in the catalog. That would go in the direction of the MML AO, which I think works well and is an architecture idea worth keeping.

@gupichon
Contributor Author

gupichon commented Mar 18, 2026

Reading the recent discussions, I think we should address these topics more broadly. Are you available for a short Teams meeting this morning?

I suggest splitting the discussion into three parts:

  • Backend linking: managing dynamic links (DB) and static links (configuration files). Possibly with a user API for exploration? This might be challenging if the database is not very clean (e.g., contains test devices). That’s what I initially started here, but it ended up diverging a bit too much.
  • Handling link changes for high-level objects, and clarifying the distinction between generic high-level object definitions (modifiable and responsible for propagating changes) and their instantiations across all control systems (not changeable by the user).
  • A catalog of high-level object definitions, similar to Yellow Pages (with or without control system objects?).

@TeresiaOlsson
Contributor

I am. Only thing I have planned for today is to try to make some progress on the validator decorator.

If it would help, I can show the MML AO and which parts of it I think are the ones that the users like and would be worth keeping in pyAML in some form. Obviously there are also bad parts that should not be kept ;)

@JeanLucPons
Contributor

For me it looks like that only changes the TANGO attribute for the live mode? Or will it somehow also change the TANGO attribute for all the other control modes the user may have configured that use the same BPM? If it does, that becomes confusing for the users, but if it doesn't, the user needs to do the same thing for all the modes even though they all referred to the same device in the original yaml config.

It changes only for live. If you want to change it for all lives, you may use the yellow pages later, which unfortunately returns names at the moment :(

Personally, I think that syntax is too complicated compared to how it's done in MML, and I doubt the current MML users will like it. They are already complaining about the pyAML user interface being too complicated and MML being easier to use. I think the user interface would be simpler if we don't allow changing the configuration at the element level at all but instead have something like the catalog, which is the only source of information for configuration data and manages changes to it. I would rather extend the catalog to include everything in the config and not just the control system parts. The yellow pages functionality can then just be something that prints out the information that is in the catalog. That would go in the direction of the MML AO, which I think works well and is an architecture idea worth keeping.

I propose a one-line syntax. I don't understand your point: referencing is done by PyAML name, and changes are already propagated. Why do you want one more layer? It will complicate the config, which is already too complex.

We have a maintainer meeting next Friday to discuss this and try to find an agreement.

@TeresiaOlsson
Contributor

Okay, then let's discuss it on Friday. @gubaidulinvadim Is there space to put it on the agenda? Maybe we need to postpone something and talk about this instead?

@gupichon
Contributor Author

Ok, let’s discuss this on Friday.

@gubaidulinvadim
Contributor

Normally, there's no space on the agenda. But we can give priority to this discussion. Then we have to postpone the measurement class proposal of @JeanLucPons or your report/discussion on the steering committee roadmap.

@TeresiaOlsson
Contributor

What about the set and wait functionality? According to the roadmap, simplifying the configuration is now priority 1 and the hardware abstraction layer priority 2, so perhaps that could also be postponed for this discussion instead?

@gubaidulinvadim
Contributor

The set and wait functionality is already postponed. The only two big items on the agenda are the Measurement Class and the discussion of the roadmap.

@JeanLucPons
Contributor

JeanLucPons commented Mar 18, 2026

I agree, simplifying the configuration is high priority.
For dynamic config, my proposal would be:

# Load a config with BPMs having no x_pos, y_pos, x_pos_index, y_pos_index
sr: Accelerator = Accelerator.load("MyACC.yaml")
sr.yellow_pages.get("BPM").x_pos = "ORBITCC:rdPos"
sr.yellow_pages.get("BPM").y_pos = "ORBITCC:rdPos"
sr.yellow_pages.get("BPM").x_pos_index = [i*2 for i in range(len(sr.yellow_pages.get("BPM")))]
sr.yellow_pages.get("BPM").y_pos_index = [i*2+1 for i in range(len(sr.yellow_pages.get("BPM")))]

And usage with sr.live should also stay valid.

@TeresiaOlsson
Contributor

When deciding on the syntax, I think we need to consider the Python level of the users. sr.yellow_pages.get("BPM").x_pos_index = [i*2 for i in range(len(sr.yellow_pages.get("BPM")))] is one line, but a line that is complicated to understand if you are a new Python user.

The MML users tell me that they are already struggling with the sr.live.get_bpms("BPM").h.get() syntax because there are too many levels of attributes and no generic way to read and set a device independently of its type like there is in MML. They have already started to ask whether they can have the getsp, setsp syntax instead because they find it easier to use.

@JeanLucPons
Contributor

There is always a learning curve for all: python, pyaml, accelerator physics...
We will try to make them as rapid as possible.

Personally, I do:

orbit = sr.live.get_bpms("BPM").positions

# Then i use
...
orb0 = orbit.get()
...
orb1 = orbit.get()

@TeresiaOlsson
Contributor

TeresiaOlsson commented Mar 18, 2026

I think that just removing a lot of the get_something functions, as suggested by @GamelinAl, would already help. Something generic like sr.live.get("BPM") already makes the syntax easier to remember since it's the same regardless of which element you are interested in. Or, I think even better, sr.live["BPM"], to make the syntax more similar to how you get elements in pyAT, since these users need to learn both pyAT and pyAML at the same time.

@JeanLucPons
Contributor

Please discuss this in #199.
Thanks

@TeresiaOlsson
Contributor

@gupichon I have been looking at HAPPI which I would say is somewhat of a similar idea as the catalog.

https://pcdshub.github.io/happi/v3.0.1/

If it helps, there might be some useful inspiration from looking at how they have implemented it. If I have understood it correctly, it's like a device database which either has a json file or a real database in the back. Sounded to me a bit similar to our approach with a yaml file or what you wanted to do with connection to the TANGO database.

What I liked about it is that they are thinking about every entry as something which contains the information about how to import and instantiate a specific Python object. They say this:

The basic HappiItem has entries that tell happi how to import and instantiate a specific Python object. The fields required for this are device_class, args, and kwargs. In short, the effect of loading this device would be to import device_class and instantiate it by way of device_class(*args, **kwargs).

Thinking about it like that at least made it easier for me to understand what the point of the entry is and how the attributes I put there are going to be used. I guess in our case type would be which DeviceAccess to create and then the rest of the fields are kwargs?
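
The import-and-instantiate pattern from the quoted HAPPI description can be sketched with the standard library alone. This is not HAPPI's code: instantiate() is a hypothetical helper, the entry is a plain dict, and fractions.Fraction merely stands in for a DeviceAccess class.

```python
import importlib


def instantiate(entry: dict):
    """Import entry["device_class"] and call it with the stored args and kwargs."""
    module_name, _, class_name = entry["device_class"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(*entry.get("args", []), **entry.get("kwargs", {}))


# A standard-library class stands in for a DeviceAccess type here.
entry = {
    "device_class": "fractions.Fraction",
    "args": [3],
    "kwargs": {"denominator": 4},
}
obj = instantiate(entry)
```

Seen this way, an entry is just a dotted class path plus the arguments to call it with.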

@TeresiaOlsson
Contributor

They are also really thinking about it as a database, even though in some cases it is just a json file. Here are some examples of the user interface: https://pcdshub.github.io/happi/v3.0.1/client.html#db-choice

I quite like that interface because it means that the yaml file is no longer a long confusing text file but just the backend for a database that the user isn't meant to look at directly. Instead they would browse, add and modify things in it through a Python API. And that same Python API can also be used to create it from scratch for new users. Maybe that would solve our problem where every lab now needs to write their own scripts to generate the yaml file? Now it's a bit like everyone needs to invent that API by themselves.

@gupichon
Contributor Author

Thanks a lot, @TeresiaOlsson, this will be very helpful. I’m going to restart this development from scratch.

@JeanLucPons
Contributor

Do you want to restart this PR from scratch? Using happi?
To me, at first glance, it seems that it could replace pydantic, if happi handles validation?

@TeresiaOlsson
Contributor

I'm also exploring an implementation similar to the one in HAPPI because it seems like the way to go. I played with it a bit last week and started to like that approach more and more.

But I would not use HAPPI directly because it feels like a dependency we don't want to have. I don't think there is a big community around that package, so it might not be sustainable to depend on in the long term. I also think HAPPI doesn't do validation, and some of the things they do could have been implemented more easily if pydantic BaseModels were used instead. The HappiItem and containers look to me pretty much like they could have been BaseModels.

@TeresiaOlsson
Contributor

@gupichon I pushed what I had been playing with so far to the branch happi-style-config. So far I have only tried a very basic yaml and json backend to understand how that might work compared to how we use the config file now. No guarantee that it actually works, because I didn't get that far.

But at least my feeling was that this way we could keep pretty much the same yaml file format as now.

@gubaidulinvadim
Contributor

@TeresiaOlsson @JeanLucPons @gupichon I guess I'll put this discussion on the agenda for the meeting on Friday.

@TeresiaOlsson
Contributor

I will be on holiday on Friday so I can't come. But if pyAML decides to do it in some other way I will keep developing it as an external package which would allow the EPICS labs to have some kind of device database that we can use to generate the config file for pyAML. At the moment it's just way too difficult for us since we don't have the TANGO database that we can extract information from. I think some EPICS labs might already have something like this but at HZB we currently only store this information inside the MML config which isn't great so we need to replace it.


Development

Successfully merging this pull request may close these issues.

Adding catalogs section in the pyaml configuration file

5 participants