
SynDEx v5 Downloader Specification

Context

SynDEx v5 allows the efficient programming of parallel, distributed, heterogeneous architectures, composed of several different types of processors and of several different types of communication media. The user specifies an algorithm dataflow graph and an architecture resources graph, and provides characterized algorithm and architecture libraries. From these, SynDEx automatically generates an application-specific executive code for each processor, and a makefile to automate the compilation and linking of each executive, and its downloading into the program memory of the corresponding processor.

Because programming non-volatile program memories separately is impractical, SynDEx assumes that the only non-volatile resident program of each processor is a boot-loader (which may be very small and simple, or may rely on a large and complex operating system). This boot-loader expects an executive to be downloaded from a neighbour processor through a communication medium. The exception is a single "host" processor, designated by the name "root" in the specified architecture graph, whose boot-loader expects all executives to be stored together in its local non-volatile memory.

Consequently, SynDEx computes, over the architecture graph, an oriented coverage tree rooted on the "root" processor, and generates in each processor executive the code needed to download the compiled executives through this tree, in a predetermined order which is also used to generate the makefile.
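As an illustration only, the computation of an oriented coverage tree can be sketched as follows. The breadth-first search, the function name coverage_tree and the representation of the architecture graph as pairs of connected processors are assumptions made for this sketch; the specification only requires a tree oriented from the "root" processor.

```python
from collections import deque


def coverage_tree(links, root="root"):
    """Return {processor: [descendants]} for a tree spanning the graph.

    links: iterable of (processor_a, processor_b) pairs (hypothetical
    representation; a multipoint medium would contribute several pairs).
    Breadth-first search is an assumption of this sketch.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    tree = {root: []}
    pending = deque([root])
    while pending:
        ascendant = pending.popleft()
        for proc in neighbours.get(ascendant, []):
            if proc not in tree:  # the first visit fixes the ascendant
                tree[proc] = []
                tree[ascendant].append(proc)
                pending.append(proc)
    return tree


# p3 is reachable through p1 and p2; only one of them becomes its ascendant.
tree = coverage_tree([("root", "p1"), ("root", "p2"), ("p1", "p3"), ("p2", "p3")])
```

In this example p3 is attached to p1, because p1 is visited before p2; any other spanning tree rooted on "root" would satisfy the same requirement.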

Boot and Download Process

This process is the same for all processors, except that the root processor gets executives from its local non-volatile memory, whereas all the other processors get executives from their neighbour processor which is their ascendant towards the root of the download tree. The processors which have the same ascendant processor are called the descendants of that processor.

When powered on, each processor boots by executing its resident boot-loader, which gets the processor's executive, loads it into the processor's program memory, and executes it. During its initialization phase, the executive gets and forwards executives to all its descendants, before proceeding with application data processing.

The root processor, usually an embedded PC or another kind of workstation, boots an operating system from its disk; the operating system automatically loads and executes a startup program allowing the user to choose between different applications. During early development, this program may be a simple shell (which requires a keyboard to be available): the user enters a "make" command to compile the executives if needed, and to execute the root executive, with the other executive files passed as arguments on the command line. In applications where a permanently connected keyboard is impractical, the startup program may use another input device (for example a switch or a touch screen) to let the user choose between different predefined shell commands, starting different applications through the corresponding "make" command, or simply launching a shell for interaction with a keyboard. In more deeply embedded applications, where the root processor has neither a disk nor an operating system, all the executives are stored in a FLASH memory; the root processor boots by directly executing its own executive, and finds the other executives stored sequentially in its FLASH.

The first executive forwarded to a descendant is received, stored, and executed by that descendant's boot-loader. Then, as long as that descendant's executive asks for executives, the ascendant executive gets and forwards the next executives to the same descendant, until that descendant's executive signals that it has no more executives to forward itself. The ascendant may then switch to its next descendant, until it has no more descendants to service, and hence no more executives to forward. This fully sequential download process boots processors in the order of a depth-first traversal of the download tree.
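The resulting boot order can be sketched as a depth-first, pre-order traversal of the download tree. The function name download_order and the dictionary representation of the tree are hypothetical; the root is excluded from the result on the assumption that it already holds its own executive.

```python
def download_order(tree, root="root"):
    """List processors in the order in which their executives are forwarded.

    tree maps each processor to the list of its descendants (hypothetical
    representation). The root itself is not listed.
    """
    order = []

    def visit(proc):
        for descendant in tree.get(proc, []):
            order.append(descendant)  # booted by the first executive it receives
            visit(descendant)         # then it services its own descendants

    visit(root)
    return order


tree = {"root": ["p1", "p4"], "p1": ["p2", "p3"], "p2": [], "p3": [], "p4": []}
order = download_order(tree)
```

With this tree, p1 boots first, then forwards the executives of p2 and p3 before the root switches to p4.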

In the case of a point-to-point medium, the descendant executive may proceed to application data communications as soon as it has no more executives to forward. In the case of a multipoint medium, the descendant executive must wait until the ascendant executive signals that it has no more executives to forward; this avoids interference between descendant application data and ascendant download data.
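The rule above reduces to a small decision function. This is a sketch only: the function name, the string values naming the medium types and the two boolean flags are assumptions introduced for illustration.

```python
def may_start_application(medium, own_forwarding_done, ascendant_forwarding_done):
    """Tell whether a descendant executive may start application communications.

    medium: "point-to-point" or "multipoint" (hypothetical labels).
    own_forwarding_done: this executive has no more executives to forward.
    ascendant_forwarding_done: the ascendant has signalled the end of its
    downloads on the shared medium.
    """
    if not own_forwarding_done:
        return False
    if medium == "point-to-point":
        return True
    if medium == "multipoint":
        # Wait for the ascendant, so that application data never collides
        # with download data on the shared medium.
        return ascendant_forwarding_done
    raise ValueError(f"unknown medium type: {medium!r}")
```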

Common Download Format

Each processor type may have a different compiler (linker) output format, and some processor types may have a ROM-ed embedded boot-loader (firmware), with its own requirements on the download format. The SynDEx common download format encapsulates the details and the differences of the compiler output formats, and of the boot-loaders' download formats; it is composed as follows:

  1. a four-byte prefix encoding, as a 32-bit big-endian integer, the total length of the following sequence of bytes
  2. a sequence of bytes encoding one complete executive, structured as required by the destination boot-loader, and padded if needed with null bytes until its total length is a multiple of four
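A minimal sketch of this encoding, assuming that the length stored in the prefix counts the padded sequence (one reading of "total length of the following sequence of bytes"). The function name encode_executive is hypothetical.

```python
import struct


def encode_executive(payload: bytes) -> bytes:
    """Wrap one executive in the common download format.

    Assumption of this sketch: the prefix holds the length of the padded
    sequence, so it is always a multiple of four.
    """
    padding = -len(payload) % 4                 # 0 to 3 null bytes
    body = payload + b"\x00" * padding
    return struct.pack(">I", len(body)) + body  # ">I": 32-bit big-endian


# A 5-byte executive is padded to 8 bytes and prefixed with 00 00 00 08.
frame = encode_executive(b"\x01\x02\x03\x04\x05")
```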

The first executive forwarded to a descendant is received by that descendant's boot-loader, so it must be sent WITHOUT its four-byte prefix. The following executives sent to the same descendant are forwarded by that descendant's executive, so they must be sent WITH their four-byte prefix.
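The prefix rule above can be sketched as follows. The function name frames_for_descendant is hypothetical, and each input item is assumed to be one executive already encoded in the common download format (prefix followed by the padded sequence of bytes).

```python
PREFIX_SIZE = 4  # four-byte length prefix of the common download format


def frames_for_descendant(encoded_executives):
    """Return the byte strings actually sent to one descendant, in order.

    encoded_executives: executives in forwarding order, each carrying its
    four-byte prefix (assumption of this sketch).
    """
    frames = []
    for index, encoded in enumerate(encoded_executives):
        if index == 0:
            # Received by the boot-loader: sent without the prefix.
            frames.append(encoded[PREFIX_SIZE:])
        else:
            # Received by the descendant's executive: sent with the prefix.
            frames.append(encoded)
    return frames


frames = frames_for_descendant([b"\x00\x00\x00\x04AAAA", b"\x00\x00\x00\x04BBBB"])
```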

The sequence of bytes itself must follow the format expected by the destination boot-loader. Therefore a linker post-processor must be developed for each processor type, to translate the linker output file into the SynDEx common download format described above. All the post-processors' outputs are concatenated by the makefile into a single contiguous image (file), which the root executive uses as its source.
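As a sketch of how such an image could be built and read back, assuming that every post-processor output keeps its four-byte prefix in the image and that the prefix counts the padded sequence of bytes. The function names build_image and split_image are hypothetical.

```python
import struct


def build_image(encoded_executives):
    """Concatenate post-processor outputs, as the makefile is described to do."""
    return b"".join(encoded_executives)


def split_image(image: bytes):
    """Yield the sequence of bytes of each executive stored in the image."""
    offset = 0
    while offset < len(image):
        if offset + 4 > len(image):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", image, offset)
        start = offset + 4
        if start + length > len(image):
            raise ValueError("truncated executive")
        yield image[start:start + length]
        offset = start + length


image = build_image([b"\x00\x00\x00\x04AAAA", b"\x00\x00\x00\x08BBBBBBBB"])
bodies = list(split_image(image))
```

Because each prefix gives the length of what follows, the root executive can walk the image sequentially without any separate index.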

Downloader Macros

The downloader code is generated by two macros, loadFrom_ and loadOnto_.

Processor names are useful to address processors connected to multipoint media: a processor name may be suffixed to give the name of a user-defined macro, whose substitution gives the processor address.

As executive data may be forwarded through several communication media of different bandwidths, transfers must be synchronized so that data flows at the speed of the slowest medium. Between processors, if flow control is not supported by the media hardware, it must be implemented by "ready to receive" control messages sent by the loadFrom_ code for each chunk of data to be sent by the loadOnto_ code. Inside a processor, the cooperation between the loadFrom_ and loadOnto_ macros is based on the order in which the spawn_thread_ macros (one for each communication sequence, i.e. for each communication medium) are generated in the initialization phase of the "main_ ... endmain_" sequence. The spawn_thread_ macro corresponding to the thread_ macro of the communication sequence starting with the loadFrom_ macro (i.e. of the medium connected to the ascendant processor) is called first. It is followed by the other spawn_thread_ macros, including those, if any, corresponding to the communication sequences with a loadOnto_ macro (i.e. of the media connected to the descendant processors).
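The "ready to receive" handshake between processors can be simulated as follows. This is a sketch and not the generated macro code: two queues stand in for the medium, the functions load_onto and load_from only mimic the roles of the loadOnto_ and loadFrom_ code, and CHUNK_SIZE is a hypothetical value.

```python
import queue
import threading

CHUNK_SIZE = 4  # hypothetical chunk size, in bytes


def load_onto(data, ready, link):
    """Sender side: wait for one "ready to receive" token before each chunk."""
    for start in range(0, len(data), CHUNK_SIZE):
        ready.get()                              # blocks until the receiver is ready
        link.put(data[start:start + CHUNK_SIZE])


def load_from(total_length, ready, link):
    """Receiver side: announce readiness, then take exactly one chunk."""
    received = bytearray()
    while len(received) < total_length:
        ready.put(True)                          # "ready to receive" control message
        received += link.get()
    return bytes(received)


ready, link = queue.Queue(), queue.Queue()
data = bytes(range(10))
sender = threading.Thread(target=load_onto, args=(data, ready, link))
sender.start()
received = load_from(len(data), ready, link)
sender.join()
```

The sender never has more than one chunk in flight, so the transfer proceeds at the pace of the receiver, which is the property required when media of different bandwidths are chained.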

If the processor is a leaf node of the download tree, its loadFrom_ macro has only one argument. In this case, it directly generates the code sending to the ascendant processor a "null" message meaning that no more executives are requested. In the case of a multipoint medium, this is followed by the code waiting for other executives to be downloaded to the other processors connected to the medium, until the ascendant processor sends an "empty" executive meaning that the download process is complete on this medium.

Otherwise, before generating the code described in the previous paragraph, the loadFrom_ macro generates a RETURN instruction (which will return control after the CALL instruction generated by the spawn_thread_ macro), followed by a "loadFrom_end_:" label, and the loadFrom_ macro also defines three macros for use by the loadOnto_ macros:

If the code generated by any of these three macros is limited to a few instructions, it may be generated inline; otherwise the loadFrom_ macro generates this code as a subroutine (between the RETURN instruction and the loadFrom_end_: label), and a call to that subroutine is generated instead of the inline code.

Hardware Specific Downloader Specifications



Last update: 2000/04/17 by: syndex-support@inria.fr