Talk:V4 Design: epicsTypes

From EPICSWIKI

Here are my comments

1) All of the readers here probably already know that I feel that the unsigned types in C are useful. Well-written code range checks its operands, and unsigned operands typically require less range checking. When problems occur it's because users convert between signed and unsigned types without proper range checking. Since data access has this range checking built in (after considerable effort on my part and Ralph's, I will add), conversion problems are possibly no longer an issue.
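To make the range-checking argument concrete, here is a minimal sketch. The helper names are hypothetical and are not taken from data access; they only illustrate that an unsigned index needs one comparison where a signed one needs two, and that signed/unsigned conversion is safe once it is range checked.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helpers for illustration only; not part of data access.

// An unsigned index needs a single comparison...
inline bool indexOkUnsigned(std::uint32_t i, std::uint32_t n)
{
    return i < n;
}

// ...while a signed index needs two.
inline bool indexOkSigned(std::int32_t i, std::int32_t n)
{
    return i >= 0 && i < n;
}

// Signed to unsigned: reject values with no unsigned representation.
inline std::uint32_t toUnsignedChecked(std::int32_t v)
{
    if (v < 0) {
        throw std::range_error("negative value cannot be unsigned");
    }
    return static_cast<std::uint32_t>(v);
}

// Unsigned to signed: reject magnitudes above the signed maximum.
inline std::int32_t toSignedChecked(std::uint32_t v)
{
    const std::uint32_t maxSigned =
        static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
    if (v > maxSigned) {
        throw std::range_error("value too large for signed type");
    }
    return static_cast<std::int32_t>(v);
}
```

The unchecked `static_cast` is where the trouble usually starts; wrapping it once in a checked helper is the kind of thing a conversion layer can do on the user's behalf.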

2) You have created interface classes for buffers, arrays, enumerated, strings etc. All of these interfaces are already in data access. Was there a good justification to create new ones? In any case, it seems that we should have the goal of using the same interfaces for each of the fundamental data types?

3) These interfaces seem to require contiguous storage of random sized blocks of characters. Doesn’t this preclude use of fixed length buffer based (free list based) memory allocation, and therefore almost guarantee fragmentation of memory? Note that fragmentation isn't just a binary condition (memory of a particular size is or isn't available). Fragmentation is also an efficiency issue because most random sized block memory allocators require searching for the best sized free block. Note that one of my goals for V4 is to no longer require EPICS_CA_MAX_ARRAY_BYTES.
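For readers less familiar with free list allocation, a minimal sketch follows. The class `FreeListPool` is hypothetical and is not the EPICS free list library; it assumes a non-zero block size and is meant for character data, so it makes no alignment promises beyond that. The point is that every block has the same size, so allocation and release are constant time with no search for a best fit, and the pool cannot fragment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool, for illustration only.
// Assumes blockSize > 0.
class FreeListPool {
public:
    FreeListPool(std::size_t blockSize, std::size_t blockCount)
        : storage_(blockSize * blockCount), blockSize_(blockSize)
    {
        // Reserve up front so release() never allocates.
        free_.reserve(blockCount);
        for (std::size_t i = 0; i < blockCount; ++i) {
            free_.push_back(&storage_[i * blockSize]);
        }
    }

    // Constant time: take the most recently released block.
    // Returns a null pointer when the pool is exhausted.
    void* allocate()
    {
        if (free_.empty()) {
            return nullptr;
        }
        unsigned char* p = free_.back();
        free_.pop_back();
        return p;
    }

    // Constant time: put the block back on the free list.
    void release(void* p)
    {
        free_.push_back(static_cast<unsigned char*>(p));
    }

    std::size_t blockSize() const { return blockSize_; }
    std::size_t available() const { return free_.size(); }

private:
    std::vector<unsigned char> storage_;   // one contiguous slab
    std::size_t blockSize_;
    std::vector<unsigned char*> free_;     // blocks not currently in use
};
```

A large array would then be stored as a chain of such fixed-size blocks rather than as one contiguous allocation.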

More comments:

> > In any case, it seems
> > that we should have the goal of using the same interfaces for each of
> > the fundamental data types?
> Agreed about the goal, but your interfaces don't necessarily take into
> account all of our requirements, and we think some of your interfaces
> are stricter than they need to be, which harms performance. I suspect
> we would have few problems implementing most of your interfaces on top
> of ours, but adopting ours should be more efficient since you don't have
> to do everything one character or element at a time - we expose a whole
> segment at once.

1) I think that the above comment applies only to strings as all other types in data access are accessed in chunks - allowing the highest level of performance.

2) The data access string interface StringSegment certainly does force all string access through an indexed stream interface of int-wide getChar/putChar calls. I wanted to support all character widths through one interface. I was inclined to conclude that if they are communicating in strings they can't be too concerned about performance. I was also, I admit, not thinking about UTF-8.

3) Since your interface allows writing multiple characters to the string at a time doesn’t this force a UTF-8 implementation of complex characters (contrary to what I seem to recall reading in your doc)?

4) Forcing UTF-8 probably isn’t a bad idea. I would be amenable to just saying that if we will have wide characters they will be UTF-8. This would be a real nice simplification of course when compressing the data for the CA protocol. The StringSegment interface in data access certainly could be revised to allow blocks of characters to be read/written in UTF-8 format (assuming that all strings are streams of octets).
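To illustrate what "all strings are streams of octets" means in practice, here is a minimal sketch of encoding one character (code point) into UTF-8 octets. The function name `appendUtf8` is hypothetical and is not part of StringSegment or EpicsString; the bit layout is the standard UTF-8 encoding.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, for illustration only.
// Appends code point cp to out as 1 to 4 UTF-8 octets.
// Returns false (appending nothing) for surrogates and values above U+10FFFF.
inline bool appendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80u) {
        // 7-bit ASCII passes through unchanged as a single octet.
        out += static_cast<char>(cp);
    } else if (cp < 0x800u) {
        out += static_cast<char>(0xC0u | (cp >> 6));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    } else if (cp < 0x10000u) {
        if (cp >= 0xD800u && cp <= 0xDFFFu) {
            return false;   // surrogate halves are not valid characters
        }
        out += static_cast<char>(0xE0u | (cp >> 12));
        out += static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    } else if (cp <= 0x10FFFFu) {
        out += static_cast<char>(0xF0u | (cp >> 18));
        out += static_cast<char>(0x80u | ((cp >> 12) & 0x3Fu));
        out += static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    } else {
        return false;       // outside the Unicode range
    }
    return true;
}
```

Because plain ASCII is unchanged and wide characters become runs of ordinary octets, a block read/write interface can move UTF-8 text without knowing anything about character width.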

5) We should carefully consider implementations and interfaces, and how they may become dependent on each other. As a reminder, interfaces are pure virtual and implementations have data in them. For fundamental types we should be careful about making implementations dependent on other implementations because we can end up with code maintenance issues (been there, done that). I notice that EpicsEnum is an implementation and it is based on EpicsString - another implementation. In contrast, the data access StringSegment interface is just an interface. Another example is EpicsArray, which is admitted not to be type safe. That might be fine as internal scaffolding for an array, but when crossing boundaries between components the interface presented by data access might be more appropriate because it *is* type safe.
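A minimal sketch of the interface/implementation distinction follows. All of the names (`CharSource`, `StdStringSource`, `RepeatedCharSource`, `countChar`) are hypothetical and do not correspond to the actual data access or epicsTypes classes. The consumer depends only on the pure virtual interface, so either implementation can change or be replaced without touching it.

```cpp
#include <cstddef>
#include <string>

// Interface: pure virtual, no data members. (Hypothetical name.)
class CharSource {
public:
    virtual ~CharSource() {}
    virtual std::size_t length() const = 0;
    virtual char at(std::size_t i) const = 0;
};

// Implementation 1: holds its characters in a std::string.
class StdStringSource : public CharSource {
public:
    explicit StdStringSource(const std::string& s) : s_(s) {}
    std::size_t length() const override { return s_.size(); }
    char at(std::size_t i) const override { return s_[i]; }
private:
    std::string s_;
};

// Implementation 2: stores no character array at all.
class RepeatedCharSource : public CharSource {
public:
    RepeatedCharSource(char c, std::size_t n) : c_(c), n_(n) {}
    std::size_t length() const override { return n_; }
    char at(std::size_t) const override { return c_; }
private:
    char c_;
    std::size_t n_;
};

// Consumer: written against the interface only, so it has no
// dependency on how either implementation stores its data.
inline std::size_t countChar(const CharSource& src, char c)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < src.length(); ++i) {
        if (src.at(i) == c) {
            ++count;
        }
    }
    return count;
}
```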

6) I used a stream type of interface as a foundation for the StringSegment interface. This appears to be a nice approach IMHO. By the way, I did look at basing strings on stream strings in the C++ standard library but gave up on that when I realized that most implementations were calling malloc in the stream object's constructor - too much overhead for temporary objects mapping protocol streams so that they can cross interfaces. The fundamental problem here by the way is that the standard library made the core stream an implementation, and not an interface.

7) Déjà vu. I am always defining interfaces without epics in their name which are subsequently superseded by revised interfaces that do. Who owns this EPICS name anyways? This is vaguely reminiscent of IBM calling their computer the "personal computer" and Microsoft calling their window system "windows" - whoever owns the most generic name wins :-). Is it somewhat redundant to have to type epics every time that I type an interface name? Should there be a C++ namespace called "epics"? Is that too coarse a granularity for a namespace, and possibly also too limiting for generic things like fundamental types (is this buffer only usable with EPICS)?

8) I notice that some of these classes hand off pointers to internal storage. EpicsBuffer does this of course when we are granted access to a buffer segment, and some of the others do it also when they return a pointer to the buffer creator (it should probably be a reference that is returned). The problem is that users may continue to use handles to internal objects after the object that owns them (and manages their existence) no longer exists. Admittedly, a buffer is a very low level object and you might make exceptions for performance reasons. However, when these C++ systems become large and complex, object lifespan mistakes seem to be one of the primary failure modes. In any case I tried to avoid this in data access by making all buffer internal access occur only during the duration of the callback. It's clear in the callback that we can only access the incoming arguments for the duration of the callback. The performance is the same because we still have direct access to large chunks.
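The callback-scoped access pattern can be sketched roughly as follows. The names (`SegmentVisitor`, `SegmentedBuffer`, `ByteSum`) are hypothetical and this is not the actual data access API; it only shows the shape of the idea. The buffer never returns a pointer to the caller. Instead it calls the visitor once per segment, and the pointer is documented as valid only for the duration of that call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical callback interface, for illustration only.
// The data pointer is valid only until segment() returns.
class SegmentVisitor {
public:
    virtual ~SegmentVisitor() {}
    virtual void segment(const unsigned char* data, std::size_t n) = 0;
};

// Hypothetical buffer made of non-contiguous segments.
class SegmentedBuffer {
public:
    void append(const std::vector<unsigned char>& seg)
    {
        segments_.push_back(seg);
    }

    // The only way to reach internal storage: whole segments are handed
    // to the visitor, so access is still in large chunks.
    void visit(SegmentVisitor& v) const
    {
        for (const std::vector<unsigned char>& s : segments_) {
            if (!s.empty()) {
                v.segment(s.data(), s.size());
            }
        }
    }

private:
    std::vector<std::vector<unsigned char> > segments_;
};

// Example visitor: consumes each chunk inside the callback and keeps
// only a derived value, never the pointer itself.
class ByteSum : public SegmentVisitor {
public:
    ByteSum() : total(0), calls(0) {}
    void segment(const unsigned char* data, std::size_t n) override
    {
        for (std::size_t i = 0; i < n; ++i) {
            total += data[i];
        }
        ++calls;
    }
    unsigned long total;
    std::size_t calls;
};
```

Nothing in C++ stops a visitor from stashing the pointer, so this is a convention rather than an enforced guarantee, but it makes the lifetime rule visible at the one place where the pointer appears.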

9) I really do feel that using unsigned types for integers that should never be negative is best. Oh well, perhaps I am the only one yelling this out from the bottom of a well, and no one is listening anyways.

> Go back and read the section on the EpicsBuffer interface again;
> none of our classes require contiguous buffer storage.

Sorry, after reflecting more carefully on the code snippets (yes, I do that too) I now understand what is being done. This buffer thingy is an implementation, and not an interface, and yes it does allow segmented storage.

> Since we never really write down and agree what all the detailed
> requirements are, but just present each other with code snippets that
> implement our view of them, I think we're going to continue to have
> disagreements like these. It would speed things up immensely if we
> could first agree in writing on what we are going to accomplish *before*
> we actually start implementing it (and I plead as guilty as everyone
> else on that charge).

I think that this is another way of expressing a desire for better coordination of the overall design process. Coordination is good and more of it can't be worse than the current situation (almost none). However, dialog like this shouldn’t be seen as a problem either. Coordination without dialog (multiple voices) would probably be worse.

> The current definition of epicsTypes does NOT include unsigned
> types. The reason is Java. If we allow unsigned types it makes
> it very difficult to completely support Java.

There is possibly another way of looking at this. With Java we must use an integer type that has sufficient range to hold the magnitude of the integer number used in C. This statement is true regardless of whether the C type is a signed or unsigned integer. Data Access will of course detect problems where the integer magnitude in the server does not map to the range of the data type of the client, and report them as exception callbacks. We use unsigned integer types because their domain fits better with the arithmetic that is being conducted, and not because of their slightly higher positive range. Relying on the extra range would be flying too close to the trees IMHO. With bit fields, if we have the option to address the bit field by its offset and length, then we will almost never need to perform arithmetic directly on an unsigned 32 or 64 bit field. That’s a good thing. Otherwise Java might find itself pickled when communicating with hardware ;-)
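As a rough sketch of the bit field point: if a field is addressed by offset and length, the value handed to the client is usually small, and a simple magnitude check tells us whether it fits a signed 32 bit integer such as Java's int. The function names below are hypothetical and are not part of data access.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// Extracts 'length' bits starting at bit 'offset' (bit 0 is the least
// significant). Returns 0 if the requested field does not fit in 32 bits.
inline std::uint32_t extractBits(std::uint32_t word,
                                 unsigned offset, unsigned length)
{
    if (length == 0 || offset >= 32 || length > 32 - offset) {
        return 0;
    }
    // Shifting a 32 bit value by 32 is undefined, so handle the
    // full-width case separately.
    const std::uint32_t mask =
        (length == 32) ? 0xFFFFFFFFu : ((1u << length) - 1u);
    return (word >> offset) & mask;
}

// True if the magnitude fits in a signed 32 bit integer.
inline bool fitsInt32(std::uint32_t v)
{
    return v <= 0x7FFFFFFFu;
}
```

Only a field that is a full 32 bits wide with its top bit set fails the check, which is the case the conversion layer would need to report rather than silently reinterpret.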

> I am again haunted by Doug's request for "hello world".

If we provide some basic types à la db_access.h that are already interfaced by data access then data access shouldn’t need to be seen by naïve users. No doubt we will layer a caGet and a caPut on top which will take "hello world" as a parameter.