Difference between revisions of "V4 Name Server"

From EPICSWIKI

Revision as of 21:57, 26 May 2005

Directory Server

EPICS ChannelAccess currently uses a broadcast mechanism for resolving channel names.

Advantages:

  • Minimal if not zero initial setup.
  • At least initially very fast, because a client can send multiple name requests in one network packet; all servers receive it simultaneously, and each can answer for the names it knows.

Disadvantages:

  • Unclear how to extend this service to redundant servers, archived data, ...
  • Currently, no wildcard searches.
  • Currently, no statistics on available number of channels, types of records, ...
  • Bigger installations run into problems with broadcast traffic:
    • All devices see the requests, even though only one server has a given channel name.
    • Continued searches for unresolved names to overcome UDP limitations.

As Larry Hoff pointed out, we really want a Directory server for EPICS, something beyond a name server. A directory will not only map PV names to IP addresses of ChannelAccess servers, but also support

  • wildcard searches, reports on number of channels per IOC etc.
  • redundant servers
  • info about redundant IOCs with the same PV, archived data for a PV
  • backup/restore locations for a PV
  • PV meta information like engineer-in-charge, CA security rules, ...

LDAP, Directory, RDB, Schema, Entries, dn, cn, dc, ...

The Lightweight Directory Access Protocol (LDAP) is a network protocol. It defines how to access data over the network, regardless of language or underlying data storage. In contrast, RDBs like Oracle or MySQL define the data storage and SQL dialect, while the network access to the data is then added almost as an afterthought.

LDAP is meant as a Directory for simple attribute-based information, optimized for fast, high-volume lookup. Simple search filters are supported: patterns, "and", "or". In contrast to RDBs, there is no support for arbitrary relations between entries, and there are no transactions or rollback.

There are open-source servers for LDAP, and client libraries for C, Perl, Java and other languages. MacOSX, Linux, Windows 2000 already include LDAP clients. MS ActiveDirectory uses LDAP.

LDAP requires the data to be organized into hierarchical Entries. Each Entry has a globally unique Distinguished Name (dn) and various Attributes, that is, Type/Value pairs. The Schema defines which attributes an entry must have and which ones it may have. There are existing schemas for DNS-type data or personnel information, including types like Common Name (cn) or Domain Component (dc), but one can add custom schemata. Values might include binary data, e.g. images.

Mapping EPICS PV Information to LDAP

Each LDAP entry must have a unique name (dn). EPICS PVs must be unique as well, but only within the search domain currently defined by the EPICS_CA_ADDR_LIST. It might make sense to synchronize LDAP servers across these boundaries, in which case the LDAP dn will have to include more than just the PV name. For example, one could have a PV "fred" in both the accelerator and office network of the SNS, and enter them into an SNS-wide LDAP directory like this:

fred.linac.sns.epics:  cas="linacioc1:5065"
fred.office.sns.epics: cas="testioc47.sns.ornl.gov:7987", description="Fred's test PV"

In this example, the directory includes the ChannelAccess server for both PVs, and in one case also a description.

While an EPICS-specific tool might present the data as shown above, most generic LDAP tools and APIs use the LDAP Data Interchange Format (LDIF). In LDIF, part of the above data could look like this:

dn: dc=epics
objectclass: dcObject
objectclass: organization
o: Root of all EPICS PVs
dc: epics

dn: dc=sns,dc=epics
# Details omitted; similar to previous entry

dn: dc=linac,dc=sns,dc=epics
objectclass: dcObject
objectclass: organization
o: Root of all EPICS PVs in the SNS Linac
dc: linac

dn: cn=fred,dc=linac,dc=sns,dc=epics
objectclass: ProcessVariable
cn: fred
cas: linacioc1:5065
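
The mapping from the dotted notation to an LDIF entry is mechanical, so it can be sketched in a few lines. The following is only an illustration, not the perl script used for the tests: the function names path_to_dn and pv_entry_ldif are made up here, and the sketch assumes that neither the PV name nor the path elements contain characters that need escaping in a dn (dots, commas, "+", "=" and so on).

```python
def path_to_dn(path):
    """Translate 'fred.linac.sns.epics' into 'cn=fred,dc=linac,dc=sns,dc=epics'.

    Assumption: no element contains characters that need dn escaping.
    """
    name, *domains = path.split(".")
    return ",".join([f"cn={name}"] + [f"dc={d}" for d in domains])


def pv_entry_ldif(path, cas, description=None):
    """Return LDIF text for one ProcessVariable entry (hypothetical helper)."""
    name = path.split(".")[0]
    lines = [
        f"dn: {path_to_dn(path)}",
        "objectclass: ProcessVariable",
        f"cn: {name}",
        f"cas: {cas}",
    ]
    # 'description' is optional in the ProcessVariable schema described above.
    if description is not None:
        lines.append(f"description: {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(pv_entry_ldif("fred.linac.sns.epics", "linacioc1:5065"))
    print(pv_entry_ldif("fred.office.sns.epics",
                        "testioc47.sns.ornl.gov:7987",
                        description="Fred's test PV"))
```

The parent entries ("dc=epics", "dc=sns,dc=epics", ...) are not generated by this sketch; they would have to exist before the PV entries are added.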

Some points I had to learn:

  • Each entry has an objectclass. The schema for "ProcessVariable" would for example require a "cn" attribute, with optional "cas", "description" and maybe other attributes.
    The schema would also determine if searches for PV names are case-sensitive or not.
  • The hierarchical path "fred.linac.sns.epics" translates into a dn "cn=fred,dc=linac,dc=sns,dc=epics" which lists the relative dn of the entry itself and all path elements.
  • Each path element must exist, so to bootstrap one has to create entries "dc=epics" and "dc=sns,dc=epics" etc., for which I used the pre-defined "organization" schema.
  • To locate the PV 'fred', any of the following LDAP searches would work:
    • Scope "base" with base "cn=fred,dc=linac,dc=sns,dc=epics".
    • Scope "one" with base "dc=linac,dc=sns,dc=epics" and filter "(cn=fred)".
    • Scope "one" with base "dc=linac,dc=sns,dc=epics" and filter "(cn=f*d)",
      though this might return more records which happen to match the pattern.
    • Scope "sub" with base "dc=sns,dc=epics" and filter "(cn=fred)",
      which would also return an entry with dn "cn=fred,dc=office,dc=sns,dc=epics".
    • There is a filter option "~=" for 'approximately equal'. It is unclear what exactly defines approximate, but this might be very handy for detecting typos.
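
To make the difference between the "base", "one" and "sub" scopes concrete, here is a small in-memory model of the example directory. It does not talk to an LDAP server; the DIRECTORY list and the search function are invented for this illustration. Pattern matching uses Python's fnmatchcase, which is case-sensitive and accepts a few more wildcards than an LDAP filter, whereas a real server follows the matching rules of the schema.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the directory contents used in the example above.
DIRECTORY = [
    "dc=epics",
    "dc=sns,dc=epics",
    "dc=linac,dc=sns,dc=epics",
    "dc=office,dc=sns,dc=epics",
    "cn=fred,dc=linac,dc=sns,dc=epics",
    "cn=fred,dc=office,dc=sns,dc=epics",
]


def search(base, scope, cn_pattern="*"):
    """Return the dns of PV entries (cn=...) within scope that match cn_pattern."""
    hits = []
    for dn in DIRECTORY:
        rdn, _, parent = dn.partition(",")
        if scope == "base":      # only the entry named by the base itself
            in_scope = dn == base
        elif scope == "one":     # only the direct children of the base
            in_scope = parent == base
        elif scope == "sub":     # the base and everything below it
            in_scope = dn == base or dn.endswith("," + base)
        else:
            raise ValueError(f"unknown scope: {scope}")
        if in_scope and rdn.startswith("cn=") and fnmatchcase(rdn[3:], cn_pattern):
            hits.append(dn)
    return hits


if __name__ == "__main__":
    print(search("cn=fred,dc=linac,dc=sns,dc=epics", "base"))
    print(search("dc=linac,dc=sns,dc=epics", "one", "f*d"))
    print(search("dc=sns,dc=epics", "sub", "fred"))
```

The last call returns both the linac and the office entry, which is the behavior described for the "sub" scope above.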

OpenLDAP

OpenLDAP (http://www.openldap.org) is an open-source LDAP client library for C and an LDAP server which can store the actual data in /dev/null, a BerkeleyDB, SQL, Perl, or a remote LDAP server.

OpenLDAP includes authentication and encrypted (SSL) transport, replication (single master -> multiple slaves), and referrals from one server to another one which has more details.

EPICS PV Tests

Using OpenLDAP-stable-20050429 and perl-ldap-0.3202 on a 1.3 GHz PowerBook G4 with 780 MB RAM under MacOS X 10.3.8, I used a perl script to create PV entries, search them, and remove them. In addition, the 'ldapsearch' that comes with RedHat Enterprise AS 3 was used to check remote accessibility.

When trying to list all PVs with an appropriate search pattern, the default server configuration limits the response to 500 answers. The configuration was changed so that 'paged' searches, which keep requesting data in increments of e.g. 500 entries, are allowed to continue until all data is retrieved. From the command-line tool, that's done like this:

ldapsearch -x -LLL -E pr=10/noprompt -b 'dc=epics' '(cn=testpv*)'  cas
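
The idea behind a paged search can be sketched without a server: the client asks for one page at a time and stops once a page comes back short or empty. In the real protocol the server hands back a cookie that the client returns with the next request; in the toy model below a plain list offset plays that role, and the name paged_search is invented for the illustration.

```python
def paged_search(entries, page_size):
    """Yield successive pages of at most page_size entries.

    Stand-in for a paged LDAP search: the list offset replaces the
    server-side cookie of the real protocol.
    """
    for start in range(0, len(entries), page_size):
        yield entries[start:start + page_size]


if __name__ == "__main__":
    # 1200 made-up PV names, more than the default limit of 500 answers.
    names = [f"testpv{i:05d}" for i in range(1, 1201)]
    for number, page in enumerate(paged_search(names, 500), start=1):
        print(f"page {number}: {len(page)} entries, first is {page[0]}")
```

With 1200 entries and a page size of 500, the client receives three pages of 500, 500 and 200 entries.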

Performance

Based on the default OpenLDAP server config, PVs with names 'testrec0000000001', 'testrec0000000002' etc. were added, then looked up one-by-one via their exact name, then via a pattern 'testrec*', and finally deleted one by one.

In the first test, each individual PV search used an LDAP search filter with an exact match like 'testrec0000000001':

Adding   1000 records:  4 wallclock secs ( 2.29 usr +  0.14 sys =  2.43 CPU)
Locating 1000 records: 49 wallclock secs ( 4.10 usr +  0.21 sys =  4.31 CPU)
Match all        1000:  1 wallclock secs ( 1.18 usr +  0.02 sys =  1.20 CPU)
Delete   1000 records:  4 wallclock secs ( 1.35 usr +  0.14 sys =  1.49 CPU)

When locating individual records by using the complete dn as the search base, without a filter, the search is much faster:

Locating 1000 records:  5 wallclock secs ( 3.20 usr +  0.18 sys =  3.38 CPU)

After adding an "index cn pres,eq" to the server config, the results are the same for both the filter and the base search case. No difference was observed when tripling the cachesize from the default of 1000. The behavior scales linearly with the number of channels, always a little over 200 additions or lookups per second:

Adding   10000 records: 47 wallclock secs (21.72 usr +  1.28 sys = 23.00 CPU)
Locating 10000 records: 44 wallclock secs (32.20 usr +  1.55 sys = 33.75 CPU)
Match all        10000: 15 wallclock secs (12.15 usr +  0.64 sys = 12.79 CPU)
Delete   10000 records: 38 wallclock secs (15.41 usr +  1.16 sys = 16.57 CPU)

Adding   50000 records: 243 wallclock secs (110.36 usr +  7.18 sys = 117.54 CPU)
Locating 50000 records: 226 wallclock secs (164.97 usr +  7.67 sys = 172.64 CPU)
Match all        50000: 80 wallclock secs (63.09 usr +  3.42 sys = 66.51 CPU)
Delete   50000 records: 219 wallclock secs (82.11 usr +  6.61 sys = 88.72 CPU)
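
The "a little over 200 per second" figure can be checked directly from the wallclock numbers. The short script below only redoes that division; the WALLCLOCK table is copied from the results above, and the names used are arbitrary.

```python
# Wallclock seconds from the indexed benchmark runs, keyed by
# (operation, number of records).
WALLCLOCK = {
    ("add", 10000): 47,
    ("locate", 10000): 44,
    ("match all", 10000): 15,
    ("delete", 10000): 38,
    ("add", 50000): 243,
    ("locate", 50000): 226,
    ("match all", 50000): 80,
    ("delete", 50000): 219,
}


def rate(operation, count):
    """Records handled per wallclock second for one benchmark line."""
    return count / WALLCLOCK[(operation, count)]


if __name__ == "__main__":
    for (operation, count), seconds in WALLCLOCK.items():
        print(f"{operation:9s} {count:6d} records: "
              f"{rate(operation, count):6.1f} per second ({seconds} s)")
```

The rates for adding and locating stay between roughly 205 and 230 per second at both sizes, which is what the linear-scaling statement refers to, while the single pattern search ("match all") handles several hundred entries per second.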

"index cn,dc pres,eq,approx,sub" takes longer to insert/delete without improved search performance:

Adding   10000 records: 104 wallclock secs (26.25 usr +  1.67 sys = 27.92 CPU)
Locating 10000 records: 45 wallclock secs (32.86 usr +  1.55 sys = 34.41 CPU) 
Match all        10000: 15 wallclock secs (12.31 usr +  0.47 sys = 12.78 CPU)
Delete   10000 records: 97 wallclock secs (17.89 usr +  1.34 sys = 19.23 CPU)

The CPU load is a rough 50/50 split between the perl and slapd processes. The perl ldap library is 100% perl, including the network handling and quite a few data conversions from arrays into hashes and back, so a pure C client might be a little faster.

Though an index slows down the insertion of PVs, it speeds up certain retrieval methods, as probably desired for an EPICS directory server. The search mechanism is based on round-trip requests. Searching PVs one PV name at a time is slower than requesting all PVs that match a pattern. Unfortunately, typical CA clients will have to request data for specific PV names, not by pattern, so the performance is below the ideal ChannelAccess case where search requests go out onto an idle network.

Some interpolations of the benchmark: A typical SNS LLRF IOC has 2300 records. On startup, sending those to the directory server would take about 10 seconds.

A typical SNS LLRF overview screen has about 400 PVs, and current connection times vary from less than a second to several tens of seconds, depending on how well the search requests reach the IOCs. Resolving their CA servers would take 2 seconds with LDAP, which would then be followed by the time required to actually connect to those servers.
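
These two estimates follow from the 10000-record benchmark lines if one assumes, as observed above, that the time per record stays constant. A sketch of the arithmetic, with constant and function names chosen only for this example:

```python
# Rates taken from the indexed 10000-record run (records per wallclock second).
ADD_RATE = 10000 / 47      # adding entries
LOOKUP_RATE = 10000 / 44   # locating entries one by one


def seconds_for(count, rate):
    """Estimated wallclock time, assuming constant time per record."""
    return count / rate


if __name__ == "__main__":
    ioc_records = 2300   # typical SNS LLRF IOC
    screen_pvs = 400     # typical SNS LLRF overview screen
    print(f"register {ioc_records} records: "
          f"{seconds_for(ioc_records, ADD_RATE):.1f} s")
    print(f"resolve {screen_pvs} PVs: "
          f"{seconds_for(screen_pvs, LOOKUP_RATE):.1f} s")
```

This gives a bit under 11 seconds for the IOC startup and a bit under 2 seconds for the screen, in line with the rounded figures quoted above.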

To be investigated

  • How to access it from vxWorks
  • How does an LDAP server used for EPICS cooperate with existing LDAP servers for DNS, email etc. Should it be one and the same? Use special port numbers for 'EPICS' LDAP?
  • API: What type of API would EPICS tools use? Whatever LDAP library they use? An 'EPICS wrapper' around LDAP? Gasper suggested looking at the Java JNDI API for inspiration.
  • Common Database headaches:
    • To simply get a PV into LDAP, one has to check if this PV already exists and then either 'add' or 'modify'.
    • What if a channel is no longer available? An IOC shutting down could remove PVs from the LDAP server, but an IOC that crashes won't.
  • Replication: How to use it, how fast is it etc.
  • Authentication and encryption: How hard is it to configure?
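
The 'add' versus 'modify' headache from the list above might be hidden behind a small helper. The sketch below uses an in-memory class as a stand-in for the directory; FakeDirectory and publish_pv are hypothetical names and not part of any LDAP library. It also leaves the crashed-IOC question open and ignores the race between the existence check and the write when two IOCs publish the same PV.

```python
class FakeDirectory:
    """In-memory stand-in for an LDAP server; not a real LDAP API."""

    def __init__(self):
        self.entries = {}

    def exists(self, dn):
        return dn in self.entries

    def add(self, dn, attrs):
        # Like a real directory, refuse to add an entry that already exists.
        if dn in self.entries:
            raise KeyError(f"entry already exists: {dn}")
        self.entries[dn] = dict(attrs)

    def modify(self, dn, attrs):
        # Like a real directory, refuse to modify a missing entry.
        if dn not in self.entries:
            raise KeyError(f"no such entry: {dn}")
        self.entries[dn].update(attrs)


def publish_pv(directory, dn, attrs):
    """Add the PV entry if it is new, otherwise update it in place."""
    if directory.exists(dn):
        directory.modify(dn, attrs)
        return "modified"
    directory.add(dn, attrs)
    return "added"


if __name__ == "__main__":
    directory = FakeDirectory()
    dn = "cn=fred,dc=linac,dc=sns,dc=epics"
    print(publish_pv(directory, dn, {"cas": "linacioc1:5065"}))
    # The IOC restarts and registers the same PV again.
    print(publish_pv(directory, dn, {"cas": "linacioc1:5065"}))
```

An 'EPICS wrapper' around LDAP, as asked about in the API item, would be a natural place for such a helper.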