V4 Name Server

From EPICSWIKI
Revision as of 22:01, 26 May 2005 by KayKasemir (200000 record numbers)

Directory Server

EPICS ChannelAccess currently uses a broadcast mechanism for resolving channel names. Advantages:

  • Minimal if not zero initial setup.
  • At least initially very fast: a client can pack multiple name requests into one network packet, and all servers receive it simultaneously and answer for the names they know.

Disadvantages:

  • Unclear how to extend this service to redundant servers, archived data, ...
  • Currently, no wildcard searches.
  • Currently, no statistics on available number of channels, types of records, ...
  • Bigger installations run into problems with broadcast traffic:
    • All devices see the requests, even though only one server has a given channel name.
    • Continued searches for unresolved names to overcome UDP limitations.

As Larry Hoff pointed out, we really want a Directory server for EPICS, something beyond a name server. A directory will not only map PV names to IP addresses of ChannelAccess servers, but also support

  • wildcard searches, reports on number of channels per IOC etc.
  • redundant servers
  • info about redundant IOCs with the same PV, archived data for a PV
  • backup/restore locations for a PV
  • PV meta information like engineer-in-charge, CA security rules, ...

LDAP, Directory, RDB, Schema, Entries, dn, cn, dc, ...

The Lightweight Directory Access Protocol (LDAP) is a network protocol. It defines how to access data over the network, regardless of language or underlying data storage. In contrast, RDBs like Oracle or MySQL define the data storage and SQL dialect, while the network access to the data is then added almost as an afterthought.

LDAP is meant as a Directory for simple attribute-based information for fast, high volume lookup. Simple search filters are supported: Patterns, "and", "or". In contrast to RDBs, there is no support for arbitrary relations between entries, and there are no transactions or rollback.
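
To make the filter idea concrete, here is a minimal Python sketch that builds LDAP filter strings in the usual prefix notation, where "&" means "and" and "|" means "or". The helper names (escape_value, equals, starts_with, all_of, any_of) are made up for this page; the "cas" attribute is the one proposed further down. Only the string syntax is shown, no server is contacted.

```python
def escape_value(value: str) -> str:
    """Escape characters that are special inside an LDAP filter value."""
    special = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(special.get(ch, ch) for ch in value)


def equals(attr: str, value: str) -> str:
    """Exact match on one attribute, e.g. (cn=fred)."""
    return f"({attr}={escape_value(value)})"


def starts_with(attr: str, prefix: str) -> str:
    """Pattern match: the attribute value begins with the given prefix."""
    return f"({attr}={escape_value(prefix)}*)"


def all_of(*filters: str) -> str:
    """'and' of several filters; the operator comes first."""
    return "(&" + "".join(filters) + ")"


def any_of(*filters: str) -> str:
    """'or' of several filters."""
    return "(|" + "".join(filters) + ")"


# All PVs whose name starts with 'test' and which live on one CA server.
print(all_of(starts_with("cn", "test"), equals("cas", "linacioc1:5065")))
```

Escaping matters as soon as a PV name could contain "*" or parentheses, because those would otherwise be read as part of the filter syntax.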

There are open-source servers for LDAP, and client libraries for C, Perl, Java and other languages. MacOSX, Linux, Windows 2000 already include LDAP clients. MS ActiveDirectory uses LDAP.

LDAP requires the data to be organized into hierarchical Entries. Each Entry has a globally unique Distinguished Name (dn), and various Attributes, that is Type/Value pairs. The Schema defines which attributes an entry must have and which ones it might have. There are existing schemas for DNS-type data or personnel information, including types like Common Name (cn) or Domain Component (dc), but one can add custom schemata. Values might include binary data, e.g. images.

Mapping EPICS PV Information to LDAP

Each LDAP entry must have a unique name (dn). EPICS PVs must be unique as well, but only within the search domain currently defined by the EPICS_CA_ADDR_LIST. It might make sense to synchronize LDAP servers across these boundaries, in which case the LDAP dn will have to include more than just the PV name. For example, one could have a PV "fred" in both the accelerator and office network of the SNS, and enter them into an SNS-wide LDAP directory like this:

fred.linac.sns.epics:  cas="linacioc1:5065"
fred.office.sns.epics: cas="testioc47.sns.ornl.gov:7987", description="Fred's test PV"

In this example, the directory includes the ChannelAccess server for both PVs, and in one case also a description.

While an EPICS-specific tool might present the data as shown above, most generic LDAP tools and APIs use the LDAP Data Interchange Format (LDIF). In LDIF, part of the above data could look like this:

dn: dc=epics
objectclass: dcObject
objectclass: organization
o: Root of all EPICS PVs
dc: epics

dn: dc=sns,dc=epics
# Details omitted; similar to previous entry

dn: dc=linac,dc=sns,dc=epics
objectclass: dcObject
objectclass: organization
o: Root of all EPICS PVs in the SNS Linac
dc: linac

dn: cn=fred,dc=linac,dc=sns,dc=epics
objectclass: ProcessVariable
cn: fred
cas: linacioc1:5065
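
A tool that feeds PVs into the directory would have to produce such LDIF itself. The following Python sketch shows one way this might look; it is not the perl script used for the tests below. The function names and the wording of the "o:" lines are invented here, and it assumes that the PV name and the path elements contain no dots or commas and that all values are plain ASCII (LDIF requires base64 encoding otherwise).

```python
def path_to_dn(path):
    """Turn 'fred.linac.sns.epics' into 'cn=fred,dc=linac,dc=sns,dc=epics'."""
    name, *domains = path.split(".")
    return ",".join([f"cn={name}"] + [f"dc={d}" for d in domains])


def pv_to_ldif(path, cas, description=None):
    """Return LDIF text for one PV, including all of its parent entries."""
    name, *domains = path.split(".")
    entries = []
    # Parent entries first, outermost ('dc=epics') down to the innermost.
    for i in range(len(domains) - 1, -1, -1):
        dn = ",".join(f"dc={d}" for d in domains[i:])
        entries.append("\n".join([
            f"dn: {dn}",
            "objectclass: dcObject",
            "objectclass: organization",
            f"o: EPICS PVs below {'.'.join(domains[i:])}",  # made-up wording
            f"dc: {domains[i]}",
        ]))
    # The PV entry itself, using the proposed 'ProcessVariable' class.
    lines = [
        f"dn: {path_to_dn(path)}",
        "objectclass: ProcessVariable",
        f"cn: {name}",
        f"cas: {cas}",
    ]
    if description:
        lines.append(f"description: {description}")
    entries.append("\n".join(lines))
    # LDIF separates entries with one blank line.
    return "\n\n".join(entries) + "\n"


print(pv_to_ldif("fred.office.sns.epics", "testioc47.sns.ornl.gov:7987",
                 "Fred's test PV"))
```

When several PVs share the same parents, the parent entries would of course be written only once.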

Some points I had to learn:

  • Each entry has an objectclass. The schema for "ProcessVariable" would for example require a "cn" attribute, with optional "cas", "description" and maybe other attributes.
    The schema would also determine if searches for PV names are case-sensitive or not.
  • The hierarchical path "fred.linac.sns.epics" translates into a dn "cn=fred,dc=linac,dc=sns,dc=epics" which lists the relative dn of the entry itself and all path elements.
  • Each path element must exist, so to bootstrap one has to create entries "dc=epics" and "dc=sns,dc=epics" etc., for which I used the pre-defined "organization" schema.
  • To locate the PV 'fred', any of the following LDAP searches would work:
    • Scope "base" with base "cn=fred,dc=linac,dc=sns,dc=epics".
    • Scope "one" with base "dc=linac,dc=sns,dc=epics" and filter "(cn=fred)".
    • Scope "one" with base "dc=linac,dc=sns,dc=epics" and filter "(cn=f*d)",
      though this might return more records which happen to match the pattern.
    • Scope "sub" with base "dc=sns,dc=epics" and filter "(cn=fred)",
      which would also return an entry with dn "cn=fred,dc=office,dc=sns,dc=epics".
    • There is a filter option "~=" for 'approximately equal'. Unclear what exactly defines approximate, but this might be very handy for detecting typos.
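
The difference between the "base", "one" and "sub" scopes can be modelled in a few lines of Python. This is only an in-memory illustration, not a client for a real server: the dictionary, the extra PV "ford" and its CA server are made up, dn strings are compared literally (a real server normalizes case and whitespace), and fnmatchcase treats "?" and "[...]" as wildcards while LDAP patterns only know "*".

```python
from fnmatch import fnmatchcase

# dn -> attributes; the container entries carry no 'cn'.
DIRECTORY = {
    "dc=epics": {},
    "dc=sns,dc=epics": {},
    "dc=linac,dc=sns,dc=epics": {},
    "dc=office,dc=sns,dc=epics": {},
    "cn=fred,dc=linac,dc=sns,dc=epics": {"cn": "fred", "cas": "linacioc1:5065"},
    "cn=ford,dc=linac,dc=sns,dc=epics": {"cn": "ford", "cas": "linacioc2:5065"},
    "cn=fred,dc=office,dc=sns,dc=epics": {"cn": "fred",
                                          "cas": "testioc47.sns.ornl.gov:7987"},
}


def in_scope(dn, base, scope):
    """Decide whether an entry lies within the search scope below 'base'."""
    if scope == "base":   # only the base entry itself
        return dn == base
    if scope == "one":    # only the direct children of the base
        _, _, parent = dn.partition(",")
        return parent == base
    if scope == "sub":    # the base and everything below it
        return dn == base or dn.endswith("," + base)
    raise ValueError(f"unknown scope {scope!r}")


def search(directory, base, scope, cn_pattern=None):
    """Return the sorted dns in scope whose 'cn' matches the pattern, if given."""
    hits = []
    for dn, attrs in directory.items():
        if not in_scope(dn, base, scope):
            continue
        if cn_pattern is not None:
            if "cn" not in attrs or not fnmatchcase(attrs["cn"], cn_pattern):
                continue
        hits.append(dn)
    return sorted(hits)


print(search(DIRECTORY, "dc=linac,dc=sns,dc=epics", "one", "f*d"))
print(search(DIRECTORY, "dc=sns,dc=epics", "sub", "fred"))
```

The first search returns both "fred" and "ford" from the linac branch, the second one returns "fred" from the linac and from the office branch.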

OpenLDAP

OpenLDAP [1] is an open-source LDAP client library for C and an LDAP server whose backends can keep the actual data in /dev/null, a BerkeleyDB, an SQL database, Perl code, or a remote LDAP server.

OpenLDAP includes authentication and encrypted (SSL) transport, replication (single master -> multiple slaves), and referrals from one server to another one which has more details.

EPICS PV Tests

Using OpenLDAP-stable-20050429 and perl-ldap-0.3202 on a 1.3 GHz PowerBook G4 with 780 MB RAM under MacOS X 10.3.8, I used a perl script to create PV entries, search them, and remove them. In addition, the 'ldapsearch' that comes with RedHat Enterprise AS 3 was used to check remote accessibility.

When trying to list all PVs with an appropriate search pattern, the default server configuration limits the response to 500 answers. This was changed such that 'paged' searches which keep requesting data in increments of e.g. 500 entries are allowed to continue until all data is retrieved. From the command-line tool, that's done like this:

ldapsearch -x -LLL -E pr=10/noprompt -b 'dc=epics' '(cn=testpv*)'  cas
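
The paging itself is a simple loop: the client asks for one page, the server answers with that page plus a cookie, and the client repeats the request with the cookie until the cookie comes back empty. The Python sketch below mimics this with a list instead of a server. The function names are invented, and the cookie here is just a list position, while a real server hands out an opaque value.

```python
def fetch_page(entries, page_size, cookie):
    """Stand-in for the server: return one page and the cookie for the next."""
    start = int(cookie) if cookie else 0
    page = entries[start:start + page_size]
    next_start = start + len(page)
    # An empty cookie tells the client that this was the last page.
    next_cookie = str(next_start) if next_start < len(entries) else ""
    return page, next_cookie


def paged_search(entries, page_size):
    """Client side: keep requesting pages until the cookie is empty."""
    cookie = ""
    while True:
        page, cookie = fetch_page(entries, page_size, cookie)
        yield from page
        if not cookie:
            break


# 1234 made-up PV names, fetched in pages of 500.
pvs = [f"testpv{i:04d}" for i in range(1, 1235)]
result = list(paged_search(pvs, 500))
print(len(result), result[0], result[-1])
```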

Performance

Based on the default OpenLDAP server config, PVs with names 'testrec0000000001', 'testrec0000000002' etc. were added, then looked up one-by-one via their exact name, then via a pattern 'testrec*', and finally deleted one by one.

In the first test, the individual PV search used an LDAP search filter with an exact match on a name like 'testrec0000000001':

Adding   1000 records:  4 wallclock secs ( 2.29 usr +  0.14 sys =  2.43 CPU)
Locating 1000 records: 49 wallclock secs ( 4.10 usr +  0.21 sys =  4.31 CPU)
Match all        1000:  1 wallclock secs ( 1.18 usr +  0.02 sys =  1.20 CPU)
Delete   1000 records:  4 wallclock secs ( 1.35 usr +  0.14 sys =  1.49 CPU)

When locating individual records by using their complete dn as the search base, without a filter, the search is much faster:

Locating 1000 records:  5 wallclock secs ( 3.20 usr +  0.18 sys =  3.38 CPU)

After adding an "index cn pres,eq" to the server config, the timings are the same for both the filter and the base search case. No difference was observed when tripling the cachesize from the default of 1000. Up to 50000 records the behavior scales linearly with the number of channels, at a little over 200 additions or lookups per second; the addition of 200000 records ran slower, at about 155 per second:

Adding   10000 records: 47 wallclock secs (21.72 usr +  1.28 sys = 23.00 CPU)
Locating 10000 records: 44 wallclock secs (32.20 usr +  1.55 sys = 33.75 CPU)
Match all        10000: 15 wallclock secs (12.15 usr +  0.64 sys = 12.79 CPU)
Delete   10000 records: 38 wallclock secs (15.41 usr +  1.16 sys = 16.57 CPU)

Adding   50000 records: 243 wallclock secs (110.36 usr +  7.18 sys = 117.54 CPU)
Locating 50000 records: 226 wallclock secs (164.97 usr +  7.67 sys = 172.64 CPU)
Match all        50000: 80 wallclock secs (63.09 usr +  3.42 sys = 66.51 CPU)
Delete   50000 records: 219 wallclock secs (82.11 usr +  6.61 sys = 88.72 CPU)

Adding  200000 records: 1293 wallclock secs (483.65 usr + 33.20 sys = 516.85 CPU)

Those 200000 records created a little over 1GB of BerkeleyDB files.
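
The rates behind these numbers are easy to recompute. The short Python sketch below divides the record counts by the wallclock seconds listed above; the 1e9 bytes used for the disk estimate is only an approximation of "a little over 1GB".

```python
# (operation, number of records, wallclock seconds) from the runs above
RUNS = [
    ("Adding", 10000, 47),
    ("Locating", 10000, 44),
    ("Delete", 10000, 38),
    ("Adding", 50000, 243),
    ("Locating", 50000, 226),
    ("Delete", 50000, 219),
    ("Adding", 200000, 1293),
]


def rate(records, seconds):
    """Operations per wallclock second."""
    return records / seconds


for operation, records, seconds in RUNS:
    print(f"{operation:8s} {records:6d} records: "
          f"{rate(records, seconds):6.1f} per second")

# Rough disk usage per PV entry, taking 1 GB as 1e9 bytes.
bytes_per_record = 1e9 / 200000
print(f"about {bytes_per_record:.0f} bytes of BerkeleyDB files per record")
```

So each PV entry, which holds little more than a name and a server address, ends up costing about 5 kB on disk with this configuration.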

"index cn,dc pres,eq,approx,sub" takes longer to insert/delete without improved search performance:

Adding   10000 records: 104 wallclock secs (26.25 usr +  1.67 sys = 27.92 CPU)
Locating 10000 records: 45 wallclock secs (32.86 usr +  1.55 sys = 34.41 CPU) 
Match all        10000: 15 wallclock secs (12.31 usr +  0.47 sys = 12.78 CPU)
Delete   10000 records: 97 wallclock secs (17.89 usr +  1.34 sys = 19.23 CPU)

The CPU load is a rough 50/50 split between the perl and slapd processes. The perl ldap library is 100% perl, including the network handling and quite some data conversions from arrays into hashes and back, so a pure C client might be a little faster.

Though an index slows down the insertion of PVs, it speeds up certain retrieval methods, as probably desired for an EPICS directory server. The search mechanism is based on round-trip requests. Searching PVs one PV name at a time is slower than requesting all PVs that match a pattern. Unfortunately, typical CA clients will have to request data for specific PV names, not by pattern, so the performance is below the ideal ChannelAccess case where search requests go out onto an idle network.

Some extrapolations from the benchmark: A typical SNS LLRF IOC has 2300 records. On startup, sending those to the directory server would take about 10 seconds.

A typical SNS LLRF overview screen has about 400 PVs, and current connection times vary from less than a second to several tens of seconds, depending on how well the search requests reach the IOCs. Resolving their CA servers would take 2 seconds with LDAP, which would then be followed by the time required to actually connect to those servers.
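
These two estimates follow directly from the rates measured in the run with 10000 records. A small Python sketch of the arithmetic, assuming that those rates also hold for much smaller batches:

```python
# Rates taken from the run with 10000 records (wallclock time).
ADD_RATE = 10000 / 47      # additions per second
LOOKUP_RATE = 10000 / 44   # lookups per second


def seconds_needed(count, per_second):
    """Time to handle 'count' PVs one at a time at the given rate."""
    return count / per_second


ioc_startup = seconds_needed(2300, ADD_RATE)       # IOC registers its records
screen_lookup = seconds_needed(400, LOOKUP_RATE)   # screen resolves its PVs

print(f"IOC with 2300 records: {ioc_startup:.1f} s")
print(f"Screen with 400 PVs:   {screen_lookup:.1f} s")
```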

To be investigated

  • How to access it from vxWorks
  • How does an LDAP server used for EPICS cooperate with existing LDAP servers for DNS, email etc.? Should it be one and the same? Use special port numbers for 'EPICS' LDAP?
  • API: What type of API would EPICS tools use? Whatever LDAP library they use? An 'EPICS wrapper' around LDAP? Gasper suggested looking at the Java JNDI API for inspiration.
  • Common Database headaches:
    • To simply get a PV into LDAP, one has to check if this PV already exists and then either 'add' or 'modify'.
    • What if a channel is no longer available? An IOC shutting down could remove PVs from the LDAP server, but an IOC that crashes won't.
  • Replication: How to use it, how fast is it etc.
  • Authentication and encryption: How hard is it to configure?