NAME
Tcl_NewObj, Tcl_DuplicateObj, Tcl_IncrRefCount, Tcl_DecrRefCount, Tcl_IsShared,
Tcl_InvalidateStringRep - manipulate Tcl values
SYNOPSIS
#include <tcl.h>
Tcl_Obj *
Tcl_NewObj()
Tcl_Obj *
Tcl_DuplicateObj(objPtr)
Tcl_IncrRefCount(objPtr)
Tcl_DecrRefCount(objPtr)
int
Tcl_IsShared(objPtr)
Tcl_InvalidateStringRep(objPtr)
ARGUMENTS
- Tcl_Obj *objPtr (in)
- Points to a value; must have been the result of a previous call to
Tcl_NewObj.
INTRODUCTION
This man page presents an overview of Tcl values (called
Tcl_Objs for
historical reasons) and how they are used. It also describes generic
procedures for managing Tcl values. These procedures are used to create and
copy values, and increment and decrement the count of references (pointers) to
values. The procedures are used in conjunction with ones that operate on
specific types of values such as
Tcl_GetIntFromObj and
Tcl_ListObjAppendElement. The individual procedures are described along
with the data structures they manipulate.
Tcl's
dual-ported values provide a general-purpose mechanism for storing
and exchanging Tcl values. They largely replace the use of strings in Tcl. For
example, they are used to store variable values, command arguments, command
results, and scripts. Tcl values behave like strings but also hold an internal
representation that can be manipulated more efficiently. For example, a Tcl
list is now represented as a value that holds the list's string representation
as well as an array of pointers to the values for each list element.
Dual-ported values avoid most runtime type conversions. They also improve the
speed of many operations since an appropriate representation is immediately
available. The compiler itself uses Tcl values to cache the instruction
bytecodes resulting from compiling scripts.
The two representations are a cache of each other and are computed lazily. That
is, each representation is only computed when necessary, it is computed from
the other representation, and, once computed, it is saved. In addition, a
change in one representation invalidates the other one. As an example, a Tcl
program doing integer calculations can operate directly on a variable's
internal machine integer representation without having to constantly convert
between integers and strings. Only when it needs a string representing the
variable's value, say to print it, will the program regenerate the string
representation from the integer. Although values contain an internal
representation, their semantics are defined in terms of strings: an up-to-date
string can always be obtained, and any change to the value will be reflected
in that string when the value's string representation is fetched. Because of
this representation invalidation and regeneration, it is dangerous for
extension writers to access
Tcl_Obj fields directly. It is better to
access Tcl_Obj information using procedures like
Tcl_GetStringFromObj
and
Tcl_GetString.
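The lazy caching described above can be sketched in a few lines of C. This is only an illustrative model, not Tcl's implementation: the DualValue type and the dual_* functions are hypothetical names invented here, and the model supports a single integer internal representation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Tcl value; not the real Tcl_Obj. */
typedef struct {
    char *bytes;     /* string representation, or NULL if invalid */
    int   hasInt;    /* nonzero if intValue is a valid internal representation */
    long  intValue;  /* internal representation */
} DualValue;

/* Create a value that starts out with only a string representation. */
DualValue dual_from_string(const char *s) {
    DualValue v;
    v.bytes = malloc(strlen(s) + 1);
    assert(v.bytes != NULL);
    strcpy(v.bytes, s);
    v.hasInt = 0;
    v.intValue = 0;
    return v;
}

/* Fetch the integer, computing and caching it from the string on first use. */
long dual_get_int(DualValue *v) {
    if (!v->hasInt) {
        assert(v->bytes != NULL);  /* one representation must be valid */
        v->intValue = strtol(v->bytes, NULL, 10);
        v->hasInt = 1;
    }
    return v->intValue;
}

/* Change the integer and invalidate the now-stale string representation. */
void dual_set_int(DualValue *v, long n) {
    v->intValue = n;
    v->hasInt = 1;
    free(v->bytes);
    v->bytes = NULL;
}

/* Fetch the string, regenerating it from the integer only if needed. */
const char *dual_get_string(DualValue *v) {
    if (v->bytes == NULL) {
        assert(v->hasInt);
        v->bytes = malloc(32);
        assert(v->bytes != NULL);
        snprintf(v->bytes, 32, "%ld", v->intValue);
    }
    return v->bytes;
}
```

In this sketch, repeated calls to dual_set_int never touch the string; it is rebuilt once, on the next call to dual_get_string. Real extension code would go through procedures such as Tcl_GetIntFromObj and Tcl_GetString rather than touching fields.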
Values are allocated on the heap and are referenced using a pointer to their
Tcl_Obj structure. Values are shared as much as possible. This
significantly reduces storage requirements because some values such as long
lists are very large. Also, most Tcl values are only read and never modified.
This is especially true for procedure arguments, which can be shared between
the caller and the called procedure. Assignment and argument binding are done
by simply assigning a pointer to the value. Reference counting is used to
determine when it is safe to reclaim a value's storage.
Tcl values are typed. A value's internal representation is controlled by its
type. Several types are predefined in the Tcl core including integer, double,
list, and bytecode. Extension writers can extend the set of types by defining
their own
Tcl_ObjType structs.
THE TCL_OBJ STRUCTURE
Each Tcl value is represented by a
Tcl_Obj structure which is defined as
follows.
typedef struct Tcl_Obj {
    int refCount;
    char *bytes;
    int length;
    const Tcl_ObjType *typePtr;
    union {
        long longValue;
        double doubleValue;
        void *otherValuePtr;
        Tcl_WideInt wideValue;
        struct {
            void *ptr1;
            void *ptr2;
        } twoPtrValue;
        struct {
            void *ptr;
            unsigned long value;
        } ptrAndLongRep;
    } internalRep;
} Tcl_Obj;
The
bytes and the
length members together hold a value's UTF-8
string representation, which is a
counted string not containing null
bytes (UTF-8 null characters should be encoded as a two byte sequence: 192,
128.)
bytes points to the first byte of the string representation. The
length member gives the number of bytes. The byte array must always
have a null byte after the last data byte, at offset
length; this
allows string representations to be treated as conventional null-terminated C
strings. C programs use
Tcl_GetStringFromObj and
Tcl_GetString
to get a value's string representation. If
bytes is NULL, the string
representation is invalid.
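The layout rules for bytes and length can be illustrated with a small model. The StringRep type and the string_rep_* functions below are hypothetical names used only for this sketch; they mirror just the two string members and are not part of the Tcl API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder mirroring only the bytes and length members. */
typedef struct {
    char *bytes;   /* NULL means the string representation is invalid */
    int   length;  /* number of data bytes, not counting the terminator */
} StringRep;

/* Copy `length` data bytes and append the required null byte at offset length. */
StringRep string_rep_set(const char *src, int length) {
    StringRep rep;
    assert(src != NULL && length >= 0);
    rep.bytes = malloc((size_t)length + 1);
    assert(rep.bytes != NULL);
    memcpy(rep.bytes, src, (size_t)length);
    rep.bytes[length] = '\0';
    rep.length = length;
    return rep;
}

/* Mark the representation invalid and release its storage. */
void string_rep_invalidate(StringRep *rep) {
    free(rep->bytes);
    rep->bytes = NULL;
    rep->length = 0;
}
```

Because a null character is stored as the two bytes 192, 128 (0xC0, 0x80), no data byte is ever zero, so strlen on bytes agrees with length in this model.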
A value's type manages its internal representation. The member
typePtr
points to the Tcl_ObjType structure that describes the type. If
typePtr
is NULL, the internal representation is invalid.
The
internalRep union member holds a value's internal representation.
This is either a (long) integer, a double-precision floating-point number, a
pointer to a value containing additional information needed by the value's
type to represent the value, a Tcl_WideInt integer, two arbitrary pointers, or
a pair made up of an unsigned long integer and a pointer.
The
refCount member is used to tell when it is safe to free a value's
storage. It holds the count of active references to the value. Maintaining the
correct reference count is a key responsibility of extension writers.
Reference counting is discussed below in the section
STORAGE MANAGEMENT OF
VALUES.
Although extension writers can directly access the members of a Tcl_Obj
structure, it is much better to use the appropriate procedures and macros. For
example, extension writers should never read or update
refCount
directly; they should use macros such as
Tcl_IncrRefCount and
Tcl_IsShared instead.
A key property of Tcl values is that they hold two representations. A value
typically starts out containing only a string representation: it is untyped
and has a NULL
typePtr. A value containing an empty string or a copy of
a specified string is created using
Tcl_NewObj or
Tcl_NewStringObj respectively. A value's string value is obtained with
Tcl_GetStringFromObj or
Tcl_GetString and changed with
Tcl_SetStringObj. If the value is later passed to a procedure like
Tcl_GetIntFromObj that requires a specific internal representation, the
procedure will create one and set the value's
typePtr. The internal
representation is computed from the string representation. A value's two
representations are duals of each other: changes made to one are reflected in
the other. For example,
Tcl_ListObjReplace will modify a value's
internal representation and the next call to
Tcl_GetStringFromObj or
Tcl_GetString will reflect that change.
Representations are recomputed lazily for efficiency. A change to one
representation made by a procedure such as
Tcl_ListObjReplace is not
reflected immediately in the other representation. Instead, the other
representation is marked invalid so that it is only regenerated if it is
needed later. Most C programmers never have to be concerned with how this is
done and simply use procedures such as
Tcl_GetBooleanFromObj or
Tcl_ListObjIndex. Programmers that implement their own value types must
check for invalid representations and mark representations invalid when
necessary. The procedure
Tcl_InvalidateStringRep is used to mark a
value's string representation invalid and to free any storage associated with
the old string representation.
Values usually remain one type over their life, but occasionally a value must be
converted from one type to another. For example, a C program might build up a
string in a value with repeated calls to
Tcl_AppendToObj, and then call
Tcl_ListObjIndex to extract a list element from the value. The same
value holding the same string value can have several different internal
representations at different times. Extension writers can also force a value
to be converted from one type to another using the
Tcl_ConvertToType
procedure. Only programmers that create new value types need to be concerned
about how this is done. A procedure defined as part of the value type's
implementation creates a new internal representation for a value and changes
its
typePtr. See the man page for
Tcl_RegisterObjType to see how
to create a new value type.
EXAMPLE OF THE LIFETIME OF A VALUE
As an example of the lifetime of a value, consider the following sequence of
commands:
set x 123
This assigns to
x an untyped value whose
bytes member points to
123 and
length member contains 3. The value's
typePtr
member is NULL.
puts "x is $x"
x's string representation is valid (since
bytes is non-NULL) and
is fetched for the command.
incr x
The
incr command first gets an integer from
x's value by calling
Tcl_GetIntFromObj. This procedure checks whether the value is already
an integer value. Since it is not, it converts the value by setting the
value's
internalRep.longValue member to the integer
123 and
setting the value's
typePtr to point to the integer Tcl_ObjType
structure. Both representations are now valid.
incr increments the
value's integer internal representation then invalidates its string
representation (by calling
Tcl_InvalidateStringRep) since the string
representation no longer corresponds to the internal representation.
puts "x is $x"
The string representation of
x's value is needed and is recomputed. The
string representation is now
124 and both representations are again
valid.
STORAGE MANAGEMENT OF VALUES
Tcl values are allocated on the heap and are shared as much as possible to
reduce storage requirements. Reference counting is used to determine when a
value is no longer needed and can safely be freed. A value just created by
Tcl_NewObj or
Tcl_NewStringObj has
refCount 0. The macro
Tcl_IncrRefCount increments the reference count when a new reference to
the value is created. The macro
Tcl_DecrRefCount decrements the count
when a reference is no longer needed and, if the value's reference count drops
to zero, frees its storage. A value shared by different code or data
structures has
refCount greater than 1. Incrementing a value's
reference count ensures that it will not be freed too early or have its value
change accidentally.
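The reference-counting rules above can be modeled as follows. This is a simplified sketch under stated assumptions, not Tcl's code: RcValue, the rc_* functions, and the rc_freed_count counter are hypothetical names, and each comment notes which real macro or procedure plays the corresponding role.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified value; only the reference count is modeled. */
typedef struct {
    int  refCount;
    long payload;
} RcValue;

int rc_freed_count = 0;  /* counts frees, for illustration only */

/* New values start with refCount 0, like a value fresh from Tcl_NewObj. */
RcValue *rc_new(long payload) {
    RcValue *v = malloc(sizeof *v);
    assert(v != NULL);
    v->refCount = 0;
    v->payload = payload;
    return v;
}

/* Record a new reference (the role of Tcl_IncrRefCount). */
void rc_incr(RcValue *v) {
    v->refCount++;
}

/* Drop a reference and free the value once no references remain
 * (the role of Tcl_DecrRefCount). */
void rc_decr(RcValue *v) {
    if (--v->refCount <= 0) {
        free(v);
        rc_freed_count++;
    }
}

/* A value is shared when more than one reference exists
 * (the role of Tcl_IsShared). */
int rc_is_shared(const RcValue *v) {
    return v->refCount > 1;
}
```

A typical sequence in this model is rc_new, one rc_incr per holder, and a matching rc_decr when each holder is done; the storage is released by the last rc_decr.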
As an example, the bytecode interpreter shares argument values between calling
and called Tcl procedures to avoid having to copy values. It assigns the
call's argument values to the procedure's formal parameter variables. In doing
so, it calls
Tcl_IncrRefCount to increment the reference count of each
argument since there is now a new reference to it from the formal parameter.
When the called procedure returns, the interpreter calls
Tcl_DecrRefCount to decrement each argument's reference count. When a
value's reference count drops to zero or below,
Tcl_DecrRefCount reclaims its storage. Most command procedures do not
have to be concerned about reference counting since they use a value
immediately and do not retain a pointer to the value after they return.
However, if they do retain a pointer to a value in a data structure, they must
be careful to increment its reference count since the retained pointer is a
new reference.
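Retaining a pointer in a long-lived data structure might look like the sketch below. The HeldValue and Slot types and all function names are hypothetical and exist only for this illustration; real code would call Tcl_IncrRefCount and Tcl_DecrRefCount on a Tcl_Obj instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified reference-counted value. */
typedef struct {
    int  refCount;
    long payload;
} HeldValue;

/* Hypothetical long-lived structure that retains a pointer to a value. */
typedef struct {
    HeldValue *held;
} Slot;

int held_freed = 0;  /* counts frees, for illustration only */

HeldValue *held_new(long payload) {
    HeldValue *v = malloc(sizeof *v);
    assert(v != NULL);
    v->refCount = 0;
    v->payload = payload;
    return v;
}

/* Drop one reference, freeing the value when none remain. */
void held_release(HeldValue *v) {
    if (--v->refCount <= 0) {
        free(v);
        held_freed++;
    }
}

/* Store a value in the slot. The slot's pointer is a new reference, so it is
 * counted. Incrementing before releasing the old value keeps the call safe
 * even when the slot already holds the same value. */
void slot_store(Slot *slot, HeldValue *v) {
    v->refCount++;
    if (slot->held != NULL) {
        held_release(slot->held);
    }
    slot->held = v;
}

/* Give up the slot's reference. */
void slot_clear(Slot *slot) {
    if (slot->held != NULL) {
        held_release(slot->held);
        slot->held = NULL;
    }
}
```

The increment-then-release order in slot_store is a design choice of this sketch: releasing first could free a value that is about to be stored again.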
Command procedures that directly modify values such as those for
lappend
and
linsert must be careful to copy a shared value before changing it.
They must first check whether the value is shared by calling
Tcl_IsShared. If the value is shared they must copy the value by using
Tcl_DuplicateObj; this returns a new duplicate of the original value
that has
refCount 0. If the value is not shared, the command procedure
“owns” the value and can safely modify it directly. For example,
the following code appears in the command procedure that implements
linsert. This procedure modifies the list value passed to it in
objv[1] by inserting
objc-3 new elements before
index.
listPtr = objv[1];
if (Tcl_IsShared(listPtr)) {
    listPtr = Tcl_DuplicateObj(listPtr);
}
result = Tcl_ListObjReplace(interp, listPtr, index, 0,
        (objc-3), &(objv[3]));
As another example,
incr's command procedure must check whether the
variable's value is shared before incrementing the integer in its internal
representation. If it is shared, it needs to duplicate the value in order to
avoid accidentally changing values in other data structures.
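The copy-before-modify rule for a command like incr can be sketched with a simplified model. CowValue and the cow_* functions are hypothetical names, and the code only illustrates the pattern of checking for sharing and duplicating; it is not the actual implementation of incr.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified value with an integer internal representation. */
typedef struct {
    int  refCount;
    long intValue;
} CowValue;

/* New values start with refCount 0. */
CowValue *cow_new(long n) {
    CowValue *v = malloc(sizeof *v);
    assert(v != NULL);
    v->refCount = 0;
    v->intValue = n;
    return v;
}

/* Return a fresh copy with refCount 0 (the role of Tcl_DuplicateObj). */
CowValue *cow_duplicate(const CowValue *src) {
    return cow_new(src->intValue);
}

/* Add delta to the value referenced by *slot, which models a variable.
 * If the value is shared, copy it first so other holders are unaffected. */
void cow_incr_in_place(CowValue **slot, long delta) {
    CowValue *v = *slot;
    if (v->refCount > 1) {          /* the role of Tcl_IsShared */
        CowValue *copy = cow_duplicate(v);
        copy->refCount++;           /* the slot now references the copy */
        v->refCount--;              /* cannot reach zero: it was shared */
        *slot = copy;
        v = copy;
    }
    v->intValue += delta;
}
```

When the value is unshared, the slot "owns" it and the update happens in place with no allocation; the copy is made only when another reference exists.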
SEE ALSO
Tcl_ConvertToType(3tcl), Tcl_GetIntFromObj(3tcl),
Tcl_ListObjAppendElement(3tcl), Tcl_ListObjIndex(3tcl),
Tcl_ListObjReplace(3tcl), Tcl_RegisterObjType(3tcl)
KEYWORDS
internal representation, value, value creation, value type, reference counting,
string representation, type conversion