In my experience, there are two general cases of insertion into a table during
a procedure: the table is normally one of the STACK tables or GENERALLOAD, but
in certain cases can be another table, normally a private one that has been
defined specially for the procedure. The identity of the table is not
important in the cases that I am going to describe. The two types might be
called 'explicit' and 'implicit', with reference to the key field of the table
into which data will be inserted.
I am going to describe in general terms the copying of a customer order. This
would use the GENERALLOAD table; the fields of the order header would go into
a tuple whose value for RECORDTYPE would be '1' and whose LINE would be 1. The
technique for copying the order lines depends on whether only the lines are
being copied, or whether any sub-forms of the lines are being copied as well.
In the first case (no sub-forms), one can simply write
INSERT INTO GENERALLOAD (LINE, RECORDTYE, ...
SELECT 1 + ORDERITEMS.LINE, '2', ...
If one were feeling adventurous, or there was no natural key for the sub-form,
one could replace ORDERITEMS.LINE with SQL.LINE. What is important is that this
number is incremented by one every time, as line 1 in GENERALLOAD holds the
header line. This is what I would describe as 'implicit' inserting.
Should there be sub-forms, the data has to be entered by means of a cursor,
where first line data is added then sub-form data. As there will no longer be
any correspondence between GENERALLOAD.LINE and ORDERITEMS.LINE, one has to
maintain a local variable (normally :LINE) whose value is incremented prior to
every insert. This is 'explicit' inserting. At the same time, data for the
sub-form could be inserted either implicitly or explicitly.
:LINE = 1;
INSERT INTO GENERALLOAD (LINE, RECORDTYE, ...
SELECT :LINE, '1', {header data};
DECLARE C1 CURSOR FOR
SELECT ORDERITEMS.ORDI, ....
OPEN C1;
GOTO 200 WHERE :RETVAL <= 0;
LABEL 100;
FETCH C1 INTO :ORDI, ....
GOTO 200 WHERE :RETVAL <= 0;
:LINE = :LINE + 1;
INSERT INTO GENERALLOAD (LINE, RECORDTYPE, ...
VALUES (:LINE, '2', {line data} ...);
/* sub-form */
INSERT INTO GENERALLOAD (LINE, RECORDTYPE, ...
SELECT :LINE + SQL.LINE, '3', {sub-form data} ...;
LOOP 100;
LABEL 200;
CLOSE C1;
EXECUTE INTERFACE ....
There is a deliberate mistake in the above code, but first let's think it
through. The value of GENERALLOAD.LINE for header data will obviously be 1, and
the value of this field for the first line's data will be 2 (as the variable
:LINE is explicitly incremented). For the first line of the sub-form data,
GENERALLOAD.LINE will be :LINE + SQL.LINE, i.e. 2 + 1, or 3. The second line
will have GENERALLOAD.LINE = 4. This can be represented in the following table
GENERALLOAD.LINE |
DATA |
1 |
order header |
2 |
first line of order
|
3 |
first line of sub-form for order line 1
|
4 |
second line of sub-form for order line 1
|
So far so good. For the second order line, :LINE will explicitly be
incremented, so this line will have GENERALLOAD.LINE = 3 ... except for the
fact that there is already a tuple with this key value in GENERALLOAD, and
so the second line will not be inserted. What is missing is the following
line:
SELECT MAX (LINE) INTO :LINE FROM GENERALLOAD;
This line should appear just before LOOP 100. As a result of this line,
:LINE will have the value 4 after the insertion of the second sub-form line,
and as this value is incremented prior to inserting the second order line,
this line will have GENERALLOAD.LINE = 5.
One can generalise this: whenever one uses the construct 'INSERT INTO ...
SELECT SQL.LINE', one must remember to increment SQL.LINE with a variable
(such as :LINE or :MAX), then after the INSERT statement should come the
statement that selects the maximum key number inserted so far into the above
variable.
Such a simple heuristic, so easy to forget: if this variable (:LINE) is not
used again, it seems that there is no point in extracting its value. For
example, I have written many procedures that send a report by email: the
procedure starts by defining some variables, then there is an insert statement
into STACK4 (or similar) based on those variables, using SQL.LINE as the key
field, then a report is executed passing STACK4 as its data. There is no loop
and so there is no need to select the maximum value of KEY from STACK4.
The problem rears its head when such code - which was written for a
non-looping procedure - gets copied into a procedure that does have a
loop. For example, I wrote a somewhat complicated procedure that
collects BOM data from all the lines in a given order; this was then extended
to work on several orders. Each order is selected via a cursor; local
variables have to be reinitialised for each new order, but as the data are
being inserted into the same STACK table, the key value should continue to
increment. In other words, the maximum value of KEY should be extracted before
the LOOP command that causes a new order to be selected.
This blog is of course being written because I fell foul of this heuristic. In
this case, the procedure had to call an external program for calculating
budget use; this program appears to work on one year's data at a time. As a
result, I was forced to use the rather arcane structure of having a loop at
the level of SQLI stages as shown below.
Stage |
remarks |
10 |
Set up variables
|
20 |
increment variables
|
30 |
external procedure
|
40 |
insert the data returned from the external procedure into a special table.
At the end check the terminating condition and set :$.GO appropriately
|
50 |
GOTO 20 if the end has not been reached, 60 if it has been reached
|
60 |
cleanup
|
One can guess what the problem was: the code in stage 40 had originally been written for one year's data and used the 'INSERT INTO ... SELECT SQL.LINE' construct without having 'SELECT MAX (LINE)' at its end. When the procedure was run, it seemed to work. Let's say that the first year had 200 lines to be inserted, the second year 300 lines and the third year 100 lines. In this case, the 200 lines of the first year would get inserted without problem; the first 200 lines of the second year would not be inserted because SQL.LINE would return values that had already been inserted, but the final 100 lines (i.e. 201-300) would be inserted, giving the impression that the procedure worked for this year. Data from the final year would not be inserted at all.
It took me quite some time to figure out what the problem was; this was exacerbated by the facts that the procedure was working on a client's data (that are unfamiliar to me) and that the procedure was based on complicated code involving budgets (that too are somewhat unfamiliar to me). All the checks that I inserted (sorry for the inadvertent pun) showed the expected results, but somehow data was not being inserted into the table. I think that I wrote about a similar problem years ago: when data does not get inserted into a table, check its primary key.
It was less clear in this case because the loop construct was not within the same stage as the data insertion, but even so .... After 'INSERT INTO ... SELECT SQL.LINE', ALWAYS add 'SELECT MAX (LINE)' at the insert statement's end.