Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
java.lang.Object
org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
- Enclosing class:
- CasSerializerSupport
An inner class that holds the data for serializing a CAS. Each call to serialize() creates its
own instance.
It is package private so that a test case can access it.
It is not static so that it can share the logger and the initializing values of the enclosing instance (this could be changed).
-
Field Summary
Fields
Modifier and Type, Field: Description
- final CASImpl cas
- private final CasSerializerSupport.CasSerializerSupportSerialize csss
- private final ErrorHandler errorHandler
- final IntVector[] indexedFSs
- final boolean isDelta: Whether the serializer needs to serialize only the deltas, that is, new FSs created after mark represented by Marker object and preexisting FSs and Views that have been modified.
- final boolean isFiltering: Whether the serializer needs to check for filtered-out types/features.
- final boolean isFormattedOutput
- final ListUtils listUtils
- final MarkerImpl marker: Used to tell if a FS was created before or after mark.
- final PositiveIntSet multiRefFSs: set of FSs that have multiple references. This is for JSON which is computing the multi-refs, not depending on the setting in a feature.
- boolean needNameSpaces
- nsPrefixesUsed: the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
- nsUriToPrefixMap: map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
- private final IntVector
- for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
- private TypeImpl[] sortedUsedTypes
- final Comparator<Integer> sortFssByType: sort a view, by type and then by begin/end asc/des for subtypes of Annotation, then by id
- final TypeSystemImpl tsi
- private final BitSet typeUsed
- final PositiveIntSet_impl visited_not_yet_written: set of FSs that have been enqueued to be serialized. Computed during "enqueue" phase, prior to encoding. Used to prevent duplicate enqueuing.
-
Constructor Summary
Constructors
- CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
- CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
Modifier and Type, Method: Description
- final int classifyType(int type): Classifies a type.
- private int compareFeat(int o1, int o2, int featCode)
- private int compareInts(int i1, int i2)
- void encodeFS(int addr): Encode an individual FS.
- private void encodeFSs()
- void encodeIndexed()
- void encodeQueued()
- private void enqueue(int addr): Enqueue an FS, and everything reachable from it.
- (package private) int enqueueCommon(int addr)
- private int enqueueCommon(int addr, boolean doDeltaAndFilteringCheck)
- (package private) int enqueueCommonWithoutDeltaAndFilteringCheck(int addr)
- private void enqueueFeatures(int addr, int typeCode): Enqueue all FSs reachable from features of the given FS.
- private void enqueueFeaturesOfFSs()
- private void enqueueFeaturesOfIndexed(): Enqueue everything reachable from features of indexed FSs.
- private void enqueueFSArrayElements(int addr): Enqueues all FS reachable from an FSArray.
- private void enqueueFSListElements(int addr): Enqueues all FS reachable from an FSList.
- private void enqueueIncoming(): Enqueues all FS that are stored in the sharedData's id map.
- private void enqueueIndexed(): add the indexed FSs onto the indexedFSs by view.
- (package private) void enqueueIndexedFs(int viewNumber, int addr)
- private void: When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified.
- (package private) int filterType(int addr)
- (package private) int getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
- int getSofaAddr(int sofaNum)
- TypeImpl[] getSortedUsedTypes()
- getXmiId(int addr): Get the XMI ID to use for an FS.
- int getXmiIdAsInt(int addr)
- (package private) boolean isArrayOrList(int typeCode)
- private boolean isArrayType(int typeCode)
- private boolean isListElementsMultiplyReferenced(int listNode, int featCode)
- private boolean isListType(int typeCode)
- private boolean isMultiRef_enqueue(int featCode, int featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat)
- boolean isStaticMultiRef(int featCode)
- private void reportMultiRefWarning(int featCode)
- void serialize(): Starts serialization
- void writeViewsCommons()
-
Field Details
-
cas
-
tsi
-
visited_not_yet_written
Set of FSs that have been enqueued to be serialized. Computed during the "enqueue" phase, prior to encoding. Used to prevent duplicate enqueuing.
-
multiRefFSs
Set of FSs that have multiple references. This is for JSON, which computes the multi-refs itself rather than depending on the setting in a feature.
-
previouslySerializedFSs
-
modifiedEmbeddedValueFSs
-
indexedFSs
-
queue
-
listUtils
-
typeCode2namespaceNames
-
typeUsed
-
needNameSpaces
public boolean needNameSpaces
-
nsUriToPrefixMap
Map from a namespace's expanded form to the namespace prefix, used to identify potential collisions when generating a namespace string.
-
nsPrefixesUsed
The set of all namespace prefixes used, so that prefixes already in use in set-aside data (XMI serialization) being merged back in can be disallowed.
-
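The interplay between nsUriToPrefixMap and nsPrefixesUsed can be pictured with a small sketch. This is not the class's actual code: the class name `NsPrefixSketch`, the method `prefixFor`, and the numeric-suffix strategy for resolving collisions are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and the suffixing strategy are assumptions, not UIMA code.
final class NsPrefixSketch {
  // namespace URI (expanded form) -> prefix already chosen for it
  final Map<String, String> nsUriToPrefix = new HashMap<>();
  // every prefix in use, including any reserved by set-aside data
  final Set<String> prefixesUsed = new HashSet<>();

  /** Returns the prefix for nsUri, choosing an unused one if none was assigned yet. */
  String prefixFor(String nsUri, String preferred) {
    String existing = nsUriToPrefix.get(nsUri);
    if (existing != null) {
      return existing; // same namespace always maps to the same prefix
    }
    String candidate = preferred;
    int suffix = 2;
    // Set.add returns false when the prefix is already taken: try the next suffix.
    while (!prefixesUsed.add(candidate)) {
      candidate = preferred + suffix++;
    }
    nsUriToPrefix.put(nsUri, candidate);
    return candidate;
  }
}
```

Pre-populating `prefixesUsed` with prefixes found in set-aside data keeps newly generated prefixes from colliding with them.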
marker
Used to tell if an FS was created before or after the mark.
-
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, plus preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if the type system of the CAS does not match the type system that was passed to the constructor of the serializer.
-
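The two conditions described for isDelta and isFiltering can be written out as a minimal sketch. The nested `Cas` and `Marker` classes are hypothetical stand-ins for CASImpl and MarkerImpl, and the null check on the filter type system is an assumption that the description does not state.

```java
// Hypothetical stand-ins: the real CASImpl, MarkerImpl and TypeSystemImpl are not reproduced here.
final class DeltaFlagsSketch {
  static final class Cas {
    final Object typeSystem;
    Cas(Object typeSystem) { this.typeSystem = typeSystem; }
  }

  static final class Marker {
    final Cas cas;
    Marker(Cas cas) { this.cas = cas; }
  }

  /** Delta mode: a marker was supplied and it belongs to the CAS being serialized. */
  static boolean isDelta(Cas cas, Marker marker) {
    return marker != null && marker.cas == cas;
  }

  /** Filtering mode: the supplied type system differs from the CAS's own (null check assumed). */
  static boolean isFiltering(Cas cas, Object filterTypeSystem) {
    return filterTypeSystem != null && filterTypeSystem != cas.typeSystem;
  }
}
```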
sortedUsedTypes
-
errorHandler
-
filterTypeSystem
-
uniqueStrings
-
isFormattedOutput
public final boolean isFormattedOutput
-
csss
-
sortFssByType
Sorts a view by type, then by begin (ascending) and end (descending) for subtypes of Annotation, then by id.
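A comparator with this ordering could look like the sketch below. The real field is a Comparator<Integer> over FS addresses. Here a hypothetical `Fs` value class stands in for the CAS lookups, and ordering types by numeric type code is an assumption. "begin/end asc/des" is read as begin ascending, then end descending.

```java
import java.util.Comparator;

// Hypothetical sketch: Fs is a stand-in for data the real comparator reads from the CAS.
final class ViewSortSketch {
  static final class Fs {
    final int id;
    final int typeCode;
    final boolean isAnnotation; // true for subtypes of Annotation
    final int begin;
    final int end;

    Fs(int id, int typeCode, boolean isAnnotation, int begin, int end) {
      this.id = id;
      this.typeCode = typeCode;
      this.isAnnotation = isAnnotation;
      this.begin = begin;
      this.end = end;
    }
  }

  static final Comparator<Fs> BY_TYPE_THEN_SPAN_THEN_ID = (a, b) -> {
    int c = Integer.compare(a.typeCode, b.typeCode); // 1. group by type
    if (c != 0) {
      return c;
    }
    if (a.isAnnotation && b.isAnnotation) {
      c = Integer.compare(a.begin, b.begin);         // 2a. begin ascending
      if (c != 0) {
        return c;
      }
      c = Integer.compare(b.end, a.end);             // 2b. end descending
      if (c != 0) {
        return c;
      }
    }
    return Integer.compare(a.id, b.id);              // 3. id as the tie-breaker
  };
}
```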
-
-
Constructor Details
-
Method Details
-
reportMultiRefWarning
- Throws:
SAXException
-
serialize
Starts serialization.
- Throws:
Exception
-
getSofaAddr
public int getSofaAddr(int sofaNum)
- Parameters:
sofaNum - starts at 1
- Returns:
the addr of the sofa FS, or 0
-
writeViewsCommons
- Throws:
Exception
-
getSortedUsedTypes
-
getUsedTypesIterable
-
enqueueIncoming
private void enqueueIncoming()
Enqueues all FSs that are stored in the sharedData's id map. This map is populated during the previous deserialization. This method is used to make sure that all incoming FSs are echoed in the next serialization. It is required if there are out-of-type FSs that are being merged back into the serialized form; those might reference some of these.
-
enqueueIndexed
private void enqueueIndexed()
Adds the indexed FSs onto indexedFSs, by view. Adds the SofaFSs onto the by-ref queue.
-
enqueueFeaturesOfIndexed
Enqueue everything reachable from features of indexed FSs.
- Throws:
SAXException
-
enqueueFeaturesOfFSs
- Throws:
SAXException
-
enqueueCommon
int enqueueCommon(int addr) -
enqueueCommonWithoutDeltaAndFilteringCheck
int enqueueCommonWithoutDeltaAndFilteringCheck(int addr) -
enqueueCommon
private int enqueueCommon(int addr, boolean doDeltaAndFilteringCheck) -
enqueueIndexedFs
void enqueueIndexedFs(int viewNumber, int addr) -
enqueue
Enqueue an FS, and everything reachable from it. This call is recursive with enqueueFeatures, and an arbitrarily long chain can cause a stack overflow error. Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
- Parameters:
addr - The FS address.
- Throws:
SAXException
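For comparison with the recursive approach described above, the following sketch shows one common way to avoid deep recursion: an explicit work stack plus a visited set. It is not how this class is implemented. The name `EnqueueSketch`, the reference map, and the use of 0 as the null address are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an iterative "enqueue everything reachable" traversal.
final class EnqueueSketch {
  private static final int[] NO_REFS = new int[0];

  /**
   * @param root address of the starting FS
   * @param refs address -> addresses referenced by that FS's features (0 means null)
   * @return addresses in the order they were enqueued, each exactly once
   */
  static List<Integer> enqueueReachable(int root, Map<Integer, int[]> refs) {
    List<Integer> queue = new ArrayList<>();
    Set<Integer> visited = new HashSet<>();   // prevents duplicate enqueuing
    Deque<Integer> pending = new ArrayDeque<>();
    pending.push(root);
    while (!pending.isEmpty()) {
      int addr = pending.pop();
      if (addr == 0 || !visited.add(addr)) {
        continue;                             // null reference or already enqueued
      }
      queue.add(addr);
      int[] targets = refs.getOrDefault(addr, NO_REFS);
      // Push in reverse so the first feature is processed first, as recursion would.
      for (int i = targets.length - 1; i >= 0; i--) {
        pending.push(targets[i]);
      }
    }
    return queue;
  }
}
```

Because the pending work lives on the heap rather than the call stack, a very long reference chain grows the deque instead of overflowing the stack.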
-
isArrayOrList
boolean isArrayOrList(int typeCode) -
isArrayType
private boolean isArrayType(int typeCode) -
isListType
private boolean isListType(int typeCode) -
isListElementsMultiplyReferenced
- Parameters:
curNode -
featCode -
- Returns:
true if OK, false if found cycle or multi-ref
- Throws:
SAXException
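The documented return contract (true if OK, false on a cycle or a multi-ref) can be illustrated with a list walk like the one below. This is a sketch under assumptions: the `tailOf` map, the `seenElsewhere` set and the use of 0 as the end-of-list marker are hypothetical and do not reflect the real CAS list representation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: walks list nodes and reports whether the list is "plain".
final class ListWalkSketch {
  /**
   * @param head          address of the first list node
   * @param tailOf        node address -> address of the next node (missing or 0 = end)
   * @param seenElsewhere nodes already reached through some other reference
   * @return true if OK, false if a cycle or a multiply referenced node is found
   */
  static boolean isPlainList(int head, Map<Integer, Integer> tailOf, Set<Integer> seenElsewhere) {
    Set<Integer> seenInThisWalk = new HashSet<>();
    for (int node = head; node != 0; node = tailOf.getOrDefault(node, 0)) {
      if (seenElsewhere.contains(node)) {
        return false; // node is also referenced from outside this list
      }
      if (!seenInThisWalk.add(node)) {
        return false; // revisiting a node means the list loops back on itself
      }
    }
    return true;
  }
}
```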
-
isMultiRef_enqueue
private boolean isMultiRef_enqueue(int featCode, int featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) throws SAXException
- Throws:
SAXException
-
enqueueFeatures
Enqueue all FSs reachable from features of the given FS.
- Parameters:
addr - address of an FS
typeCode - type of the FS
insideListNode - true iff the enclosing FS (addr) is a list type
- Throws:
SAXException
-
enqueueFSArrayElements
Enqueues all FSs reachable from an FSArray.
- Parameters:
addr - Address of an FSArray
- Throws:
SAXException
-
enqueueFSListElements
Enqueues all FSs reachable from an FSList. This does NOT include the list nodes themselves.
- Parameters:
addr - Address of an FSList
- Throws:
SAXException
-
encodeIndexed
- Throws:
Exception
-
encodeFSs
- Throws:
Exception
-
encodeQueued
- Throws:
Exception
-
compareInts
private int compareInts(int i1, int i2) -
compareFeat
private int compareFeat(int o1, int o2, int featCode) -
encodeFS
Encode an individual FS. JSON has 2 encodings:
- For type: "typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
- For id: "nnnn" : {"@type" : typeName ; feat : value ...}
For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
addr - The address to be encoded.
- Throws:
SAXException - passthruException
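The two JSON shapes described for encodeFS can be mimicked with nested maps. The sketch below only illustrates the grouping (by type name versus by id). The `Fs` class and both method names are hypothetical, and no JSON text is actually written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "by type" and "by id" shapes; not the serializer's code.
final class JsonShapeSketch {
  static final class Fs {
    final int id;
    final String typeName;
    final Map<String, Object> feats;

    Fs(int id, String typeName, Map<String, Object> feats) {
      this.id = id;
      this.typeName = typeName;
      this.feats = feats;
    }
  }

  /** By type: "typeName" : [ { "@id" : 123, feat : value }, ... ] */
  static Map<String, List<Map<String, Object>>> byType(List<Fs> fss) {
    Map<String, List<Map<String, Object>>> out = new LinkedHashMap<>();
    for (Fs fs : fss) {
      Map<String, Object> obj = new LinkedHashMap<>();
      obj.put("@id", fs.id);
      obj.putAll(fs.feats);
      out.computeIfAbsent(fs.typeName, k -> new ArrayList<>()).add(obj);
    }
    return out;
  }

  /** By id: "nnnn" : { "@type" : typeName, feat : value } */
  static Map<String, Map<String, Object>> byId(List<Fs> fss) {
    Map<String, Map<String, Object>> out = new LinkedHashMap<>();
    for (Fs fs : fss) {
      Map<String, Object> obj = new LinkedHashMap<>();
      obj.put("@type", fs.typeName);
      obj.putAll(fs.feats);
      out.put(Integer.toString(fs.id), obj);
    }
    return out;
  }
}
```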
-
filterType
int filterType(int addr) -
classifyType
public final int classifyType(int type)
Classifies a type. This returns an integer code identifying the type as one of the primitive types, one of the array types, one of the list types, or a generic FS type (anything else).
The LowLevelCAS.ll_getTypeClass(int) method classifies primitives and array types, but does not have a special classification for list types, which we need for XMI serialization. Therefore, in addition to the type codes defined on LowLevelCAS, this method can return one of the type codes TYPE_CLASS_INTLIST, TYPE_CLASS_FLOATLIST, TYPE_CLASS_STRINGLIST, or TYPE_CLASS_FSLIST.
- Parameters:
type - the type to classify
- Returns:
one of the TYPE_CLASS codes defined on LowLevelCAS or on this interface.
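The extra list classification described for classifyType can be sketched as a walk up the type hierarchy. Everything here is an illustrative assumption: the `TypeClass` enum replaces the integer TYPE_CLASS codes, the `parentOf` map replaces the real type system, and `OTHER` means "fall back to whatever LowLevelCAS.ll_getTypeClass(int) reports".

```java
import java.util.Map;

// Hypothetical sketch: detects the four built-in list families by walking supertypes.
final class ClassifySketch {
  enum TypeClass { INTLIST, FLOATLIST, STRINGLIST, FSLIST, OTHER }

  /**
   * @param typeName fully qualified type name
   * @param parentOf type name -> supertype name (absent for the root type)
   */
  static TypeClass listClassOf(String typeName, Map<String, String> parentOf) {
    for (String t = typeName; t != null; t = parentOf.get(t)) {
      switch (t) {
        case "uima.cas.IntegerList":
          return TypeClass.INTLIST;
        case "uima.cas.FloatList":
          return TypeClass.FLOATLIST;
        case "uima.cas.StringList":
          return TypeClass.STRINGLIST;
        case "uima.cas.FSList":
          return TypeClass.FSLIST;
        default:
          break; // not a list root: keep climbing
      }
    }
    return TypeClass.OTHER; // not a list type: defer to the low-level classification
  }
}
```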
-
getXmiId
Get the XMI ID to use for an FS.
- Parameters:
addr - address of FS
- Returns:
XMI ID. If addr == CASImpl.NULL, returns null
-
getXmiIdAsInt
public int getXmiIdAsInt(int addr) -
getNameSpacePrefix
-
getUniqueString
-
getTypeNameFromXmlElementName
-
isStaticMultiRef
public boolean isStaticMultiRef(int featCode)
-