DOS-PROLOG 5.0 - Details
DOS-PROLOG 5.0 is the latest version of LPA's true 32-bit Prolog compiler, and is available for both MS-DOS (version 5 or later) and compatible operating systems. Key features of DOS-PROLOG include:
- Native x87 Floating Point Package: a brand new, high-precision floating point maths package has been written to replace the emulation library used since the earliest days of 386-PROLOG. The new package is faster and more accurate, and was extensively tested for over a year prior to public release
- XML Support: Extensible Markup Language (XML) import and export is supported through a collection of special predicates, which allow XML files or streams to be read as tokens or nested terms, and written with or without indented formatting, to support web-centric and other data processing applications
- Soft Meta-Predicate Definitions: the meta_system/2 predicate can now be modified by the user, in order to produce improved program listings, debugging, call-graph and cross-referencing behaviour where user-defined meta-predicates are employed
- Dynamic Memory Reallocation: all heaps, stacks and major memory buffers can be reconfigured at runtime, during execution of any query, to allow flexible use of resources during large, complex computations
- Full Unicode Support: DOS-PROLOG supports the full Unicode 3.1 character set in files, in addition to the byte-oriented ASCII and ISO/IEC 8859-1 character sets, as well as an arbitrary 32-bit character set
- True Hashed Compilation: DOS-PROLOG includes a special mode of compilation, in which hash tables can be produced for huge Prolog databases to provide highly efficient execution of applications such as WordNet
- True 32-bit Implementation: up to 4Gb (4096Mb) of memory is directly addressable, without complex internal segmented addressing schemes
- Small Memory Requirements: needs as little as 4Mb of memory, so that as much space as possible is made available to the user's application code
- Efficient Runtime System: DOS-PROLOG is implemented in 386 assembler, and employs advanced techniques to achieve outstanding runtime performance, easily exceeding the "MegaLips" barrier on faster 486 machines and most Pentiums
- Edinburgh Standard Syntax: fully conforms to the industry standard syntax, including support for DCGs, term expansion and other advanced features
- Quintus Prolog Compatibility: the system was designed from the outset with QP compatibility as a key objective
- 64-bit Arithmetic: full-featured, efficient double precision built-in floating point maths library complements the 32-bit integer arithmetic
- Incremental and Optimised Compilation: all the flexibility of a traditional interpreter is combined with the runtime speed of fully compiled code
- Source Level and Box Model Debuggers: these make full use of windows and other GUI features to make program testing and debugging as easy as possible
- Operating System Control: full featured access to the operating system gives Prolog programs full control of files, directories, environment variables, time and date, and allows other applications to be executed
- User-definable System Hooks: many events, such as errors, spypoints, timers and messages can be directly programmed in Prolog
- Special Data Types: efficient text manipulation is supported by a true string data type, and four linked data types efficiently support compound terms
- Dynamic-linking MASM Interface: new built-in predicates can be written in 32-bit assembler, and loaded (and unloaded) at run time
- Sophisticated Data Compression: Lempel/Ziv data compression and decompression routines are built in, and are used both for saving/loading system files, and for general user-specified applications
- Powerful Data Encryption: data can be encrypted or decrypted with a unique and uncrackable algorithm, using a comb-filtered Marsaglia/Zaman random number generator whose key size is an amazing 1185 bits
- Secure Hashing and Message Digests: full support is given for a number of industry standard data integrity checks, including the CRC-32 Cyclic Redundancy Check, MD5 Message Digest and SHA-256 Secure Hash Algorithm
- Stand-alone Applications: the Developer edition of DOS-PROLOG allows self-contained, stand-alone applications to be built and distributed; end users need never know that their systems are implemented in Prolog
- Runs Directly from DOS Command Line: despite being a 32-bit, protected mode application, DOS-PROLOG is run just like any other DOS command, but totally unaffected by the 640K limit
- Colourful Windowing Subsystem: a full-featured, text based windowing system is built into DOS-PROLOG to enable colourful, easy to use applications to be written for DOS
- Special System Features: a number of special features give direct access to a timer with sub-millisecond resolution for benchmarking and other applications, and direct video BIOS and mouse control
- GraFiX Toolkit: to help you make the best use of the DOS user interface, DOS-PROLOG comes complete with the GraFiX toolkit, which provides high resolution vector and bitmap graphics
- Full Range of Options: as well as Programmer and Developer editions of DOS-PROLOG, the flex expert system toolkit and Prolog++ object-oriented toolkit options are available now
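The three integrity checks named in the feature list (CRC-32, MD5 and SHA-256) are standard algorithms, so their behaviour can be illustrated outside Prolog. The following is a minimal sketch using Python's standard library; the function name `integrity_checks` is purely illustrative, and the names of DOS-PROLOG's own predicates are not given here.

```python
import hashlib
import zlib


def integrity_checks(data: bytes) -> dict:
    """Return CRC-32, MD5 and SHA-256 values for a byte string as hex text."""
    return {
        # zlib.crc32 implements the common CRC-32 (IEEE 802.3) polynomial
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

The CRC-32 is a fast check against accidental corruption, while the two digests are intended to detect deliberate changes as well.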
Unicode and Other Text Formats
By employing a custom text encoding format, known as "UTF-BS", DOS-PROLOG manages to support a full 32-bit character set, and hence Unicode, ISO/IEC 8859-1 and ASCII, with minimal space or processing overheads compared to previous 8-bit character support (typical text applications have no overhead at all, while 8-bit binary text requires an average of only 1.5% more storage than before).
Character-level filtering of input and output is supported on a file-by-file basis, allowing applications to handle multiple character sets simultaneously and transparently, in any of 11 file formats, including ASCII, ISO/IEC 8859-1, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE among others. Altogether, DOS-PROLOG's text handling is second to none.
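To make the difference between some of these file formats concrete, the sketch below encodes one short piece of text in five of the Unicode formats listed above and reports the byte count of each. It is written in Python only for illustration; the function name and the sample text are assumptions, and the internal "UTF-BS" format is not modelled.

```python
# The five Unicode file formats named above, using Python's codec names
ENCODINGS = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]


def encoded_sizes(text: str) -> dict:
    """Return the number of bytes the text occupies in each encoding."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


def round_trips(text: str) -> bool:
    """Check that every encoding reproduces the original text exactly."""
    return all(text.encode(name).decode(name) == text for name in ENCODINGS)
```

For mostly ASCII text, UTF-8 is the most compact of the five, which is why a per-file choice of format can matter for storage.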
True 32-bit Assembler-coded Implementation
The DOS-PROLOG kernel is implemented entirely in 32-bit assembler, to provide the best overall performance possible on 386, 486 or Pentium platforms. The kernel is built from the same source code as LPA's successful WIN-PROLOG (LPA-PROLOG for Windows), which helps ensure total and continued compatibility between the DOS and Windows versions.
32-bit Memory Access: the 4 Gigabyte "limit"
The most significant feature of DOS-PROLOG is that the kernel is a true 32-bit implementation. All pointers and integers are 32 bits wide, giving DOS-PROLOG the ability directly to address up to 4 gigabytes (4096 megabytes) of memory. You can divide memory in any way you like between program, text and dynamic data: there are no built-in limits or restrictions. The traditional 64K stack and 640K memory barriers of MS-DOS simply cease to exist.
DOS-PROLOG is completely self-contained, and can run on a "raw" MS-DOS machine in the absence of any memory managers, but it is also compatible with EMS, XMS, VCPI and DPMI memory managers. In the former case, every single byte of memory can be directly accessed by DOS-PROLOG; in the latter cases, certain areas of memory may not be available to DOS-PROLOG, because the memory managers might have reserved them for their own use.
Even though DOS-PROLOG runs in protected mode in order to achieve its impressive memory handling, you still run the system from the traditional DOS command line prompt. All of the necessary 32-bit memory management software is built in, so there is no need to buy or run separate DOS extenders, C compilers or special device drivers. So far as you are concerned, DOS-PROLOG is just another standard application. It just happens to give you direct access to all the memory on your computer...
Thorough Quintus Prolog Compatibility
DOS-PROLOG has been designed from scratch for Quintus Prolog (QP) compatibility. This extends well beyond the obvious requirement of duplicating the built in predicates of QP, as it includes special features such as logical file names, background bookkeeping of predicates/file relationships, and much more. Most applications will port directly from QP to DOS-PROLOG, as the file management support isolates user programs from the intricacies of the host operating system.
Linear Garbage Collection
With all the memory available to the designers, it would have been tempting to skimp on garbage collection features. However, the garbage collection and memory management within DOS-PROLOG are even more complete and advanced than in any of LPA's previous Prolog systems.
The evaluation stacks and heap are managed by a linear garbage collector: doubling the size of evaluation space roughly doubles the time taken to collect garbage on any one occasion, but garbage collection occurs only half as often. The result is that there is no degradation of performance, even when vastly different sizes of evaluation space are used. A fixed size cell allocation scheme avoids any risk of memory fragmentation.
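The arithmetic behind this claim can be checked with a very small model. The sketch below assumes that a collection is triggered each time the evaluation space fills, and that one collection costs time proportional to the size of that space; both assumptions, and the function name, are illustrative rather than a description of the actual collector.

```python
def total_gc_cost(allocated_cells: int, space_cells: int, cost_per_cell: int = 1) -> int:
    """Total collection cost for a run that allocates `allocated_cells` cells."""
    # A collection happens every time the evaluation space has been filled
    collections = allocated_cells // space_cells
    # Linear collector: the cost of one collection grows with the space size
    cost_per_collection = space_cells * cost_per_cell
    return collections * cost_per_collection
```

Under this model, doubling `space_cells` halves the number of collections and doubles the cost of each, so the total is unchanged.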
Text space is used to store the bodies of atoms and strings, and is managed as a special heap. It too is fully garbage collectable. Program space is used to store clauses and optimised programs, and is totally reclaimable when these programs are abolished. Both text and program spaces use a special segmented heap format which guarantees that every last byte of memory can be used, with no fragmentation or degradation problems, however many times items have been added or removed.
The Incremental Compiler
DOS-PROLOG is based on an incremental compiler, whereby all clauses in a program are compiled at all times. Because the compiler is incremental, individual clauses can be added by programs - just as in a conventional interpreter. In fact, assert/1 is implemented as a call to the compiler. But DOS-PROLOG goes beyond simple clause level compilation, maintaining dynamic first argument indexing (even into compound terms), which can result in considerable runtime performance improvements over previous LPA Prologs.
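The idea of first argument indexing that is kept up to date as clauses are added can be sketched as follows. This is a simplified Python model, not the compiler's actual data structure: it handles only clauses whose first argument is bound, compound terms are modelled as tuples of the form `("functor", arg1, ...)`, and the class and method names are hypothetical.

```python
from collections import defaultdict


class Predicate:
    """Clauses of one predicate, with an index on the first argument."""

    def __init__(self):
        self.clauses = []               # clauses in assertion order
        self.index = defaultdict(list)  # first-argument key -> clause positions

    @staticmethod
    def key(arg):
        # Compound terms are indexed on functor name and arity,
        # atomic terms on their own value
        if isinstance(arg, tuple):
            return (arg[0], len(arg) - 1)
        return arg

    def assertz(self, *args):
        # The index is updated incrementally, as each clause is added
        self.index[self.key(args[0])].append(len(self.clauses))
        self.clauses.append(args)

    def candidates(self, first_arg):
        # Only clauses whose first argument can match are returned
        return [self.clauses[i] for i in self.index.get(self.key(first_arg), [])]
```

A call with a bound first argument then inspects only the matching clauses, rather than trying every clause of the predicate in turn.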
Programs can be incrementally decompiled too, allowing predicates like clause/2 and retract/1 (and even listing/1) to be implemented. Both the incremental compilation and decompilation routines are written in assembler code, and are very fast, matching the speed of their interpreter equivalents - but the clauses they manage run some 3-4 times faster than they would in an interpreter.
The Optimising Compiler
The incremental compiler/decompiler does not perform multi-argument indexing, or the space-saving last clause and other optimisations - they would simply be too complex to change on the fly during assert/1 or retract/1. With this in mind, DOS-PROLOG provides an alternative optimising compiler.
The optimising compiler can perform all the optimisations just mentioned, and more. There is full support for multiple-argument indexing, which enables the very fast matching of clauses on any arbitrary argument (as opposed to the single first-argument indexing of the incremental compiler and other Prolog systems), and a complete analysis of variable usage is performed on relations. Any variables which can be left in situ between calls are left so, substantially reducing the data traffic in most programs.
Optimised programs can run some 2-3 or more times faster than incrementally compiled ones, and use less space during execution. Decompilation of these programs is not possible, so your source code will remain completely hidden in applications.
64-bit Floating Point and 32-bit Integer Arithmetic
A fully-featured, 64-bit double precision floating point arithmetic library is built in to DOS-PROLOG. This is accessed directly by the is/2 predicate, which in turn is implemented entirely in 32-bit assembler. The library provides high-speed, high-precision arithmetic computations, and includes support for standard "calculator" functions, trigonometric and logarithmic functions, floating point to integer conversion and truncation functions, maximum and minimum functions, and pseudo random numbers. Direct support is also provided for 32-bit integer style shift, rotate and bitwise logical functions. The most common use of the is/2 predicate is for the simple adding or subtracting of two integers, and this case is specially optimised, being handled entirely in the integer domain wherever possible.
For special systems applications, a 32-bit integer, reverse polish notation evaluator is provided, once again implemented in 32-bit assembler. This evaluator provides very high speed computation of integer expressions, and supports the standard "calculator" functions, bitwise logical functions, and pseudo random numbers.
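As an illustration of how a stack-based reverse Polish evaluator over 32-bit integers works, here is a minimal Python sketch. The token syntax, the operator set and the function names are assumptions made for the example; they are not the syntax of the DOS-PROLOG evaluator.

```python
MASK = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Wrap an integer to the signed 32-bit range, as machine arithmetic would."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def eval_rpn(tokens):
    """Evaluate a list of RPN tokens, e.g. ["3", "4", "+"]."""
    stack = []
    for token in tokens:
        if token in OPS:
            b = stack.pop()  # the right operand is on top of the stack
            a = stack.pop()
            stack.append(to_signed32(OPS[token](a, b)))
        else:
            stack.append(to_signed32(int(token)))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

Because operands are always on the stack before their operator arrives, no parsing of precedence or brackets is needed, which is one reason such evaluators are fast.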
The carefully researched, pseudo random number generator is useful in a variety of applications, including games, simulations and data encryption. For speed, it is implemented in integer arithmetic, and uses a very high potency linear congruential algorithm, with a period of 2^64. The single, seedable generator is shared between the floating point and integer evaluators, so switching portions of code between these two domains will not violate the integrity of a pseudo random sequence.
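A linear congruential generator with a period of 2^64, shared between an integer and a floating point interface, might look like the sketch below. The multiplier and increment are the well-known constants from Knuth's MMIX generator, used here only as a stand-in: the constants used by DOS-PROLOG are not stated, and the class name is hypothetical.

```python
MODULUS = 1 << 64
MULTIPLIER = 6364136223846793005  # Knuth's MMIX multiplier (assumed, not LPA's)
INCREMENT = 1442695040888963407   # odd, as a full period of 2^64 requires


class Lcg64:
    """Single seedable generator serving both integer and float requests."""

    def __init__(self, seed: int = 0):
        self.state = seed % MODULUS

    def next_int(self) -> int:
        # One step of state = (a * state + c) mod 2^64, in integer arithmetic
        self.state = (self.state * MULTIPLIER + INCREMENT) % MODULUS
        return self.state >> 32  # the high 32 bits are the better distributed

    def next_float(self) -> float:
        # Same underlying sequence, scaled to the range [0, 1)
        return self.next_int() / (1 << 32)
```

Because both methods advance the same state, a program can mix integer and floating point requests and still reproduce the same sequence from the same seed.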
Powerful Input and Output Features
One of DOS-PROLOG's great strengths is its powerful collection of input and output subsystems. As well as the standard see/1, tell/1, read/1 and write/1 style predicates, support is given for formatted I/O, binary I/O, and the manipulation of a number of special I/O streams, including device and string streams as well as disk files.
Formatted I/O primitives provide complete control of the output of atoms, strings, numbers and other items. Fixed field display, with left or right justification, optional truncation, and free field formats are all supported. Numbers can be output in fixed point, signed or unsigned integer, and even non-decimal base formats (anything from base 2 (binary), through base 16 (hexadecimal) right up to base 36).
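The non-decimal output and fixed field formats described above can be illustrated with a short Python sketch. The function names `to_base` and `field` are hypothetical and do not correspond to DOS-PROLOG's formatted I/O predicates, whose names are not given here.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base(number: int, base: int) -> str:
    """Render an integer in any base from 2 to 36."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if number == 0:
        return "0"
    sign = "-" if number < 0 else ""
    number = abs(number)
    out = []
    while number:
        number, digit = divmod(number, base)  # peel off the lowest digit
        out.append(DIGITS[digit])
    return sign + "".join(reversed(out))


def field(text: str, width: int, align: str = "right") -> str:
    """Place text in a fixed-width field, truncating it if it is too long."""
    text = text[:width]
    return text.rjust(width) if align == "right" else text.ljust(width)
```

Base 36 is the natural upper limit because it uses exactly the ten digits and the twenty-six letters of the alphabet.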
Matching formatted input allows structured records to be read in with automatic type checking. As well as the obvious applications such as writing or reading data in tabular formats, the formatted I/O predicates make it easy to build interfaces to ASCII files from other applications.
Special binary I/O predicates help with the interpretation of non-ASCII data in other applications' files, allowing 8, 16, 24 and 32 bit words to be read or written directly. As an example of using both formatted and binary I/O, the Prolog source code of an interface to Dbase III files is included with DOS-PROLOG.
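Reading and writing words of 8, 16, 24 and 32 bits amounts to packing integers into a fixed number of bytes in a known byte order. The Python sketch below shows the idea, and applies it to the first twelve bytes of a dBase III file header, which stores its record count and sizes as little-endian integers. The function names are hypothetical, and this is not the interface included with DOS-PROLOG.

```python
def write_word(value: int, bits: int, byteorder: str = "little") -> bytes:
    """Pack an unsigned integer into bits/8 bytes (bits is 8, 16, 24 or 32)."""
    if bits not in (8, 16, 24, 32):
        raise ValueError("unsupported word size")
    return (value & ((1 << bits) - 1)).to_bytes(bits // 8, byteorder)


def read_word(data: bytes, offset: int, bits: int, byteorder: str = "little") -> int:
    """Unpack an unsigned integer of the given width from a byte string."""
    if bits not in (8, 16, 24, 32):
        raise ValueError("unsupported word size")
    return int.from_bytes(data[offset:offset + bits // 8], byteorder)


def dbase3_header(data: bytes) -> dict:
    """Decode the fixed fields at the start of a dBase III header."""
    return {
        "version": read_word(data, 0, 8),        # byte 0: file type
        "records": read_word(data, 4, 32),       # bytes 4-7: record count
        "header_bytes": read_word(data, 8, 16),  # bytes 8-9: header length
        "record_bytes": read_word(data, 10, 16), # bytes 10-11: record length
    }
```

Bytes 1 to 3 of the header hold the date of last update and are skipped here for brevity.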
Input and Output Streams and Devices
As well as disk files and the standard "user" device (the console), DOS-PROLOG supports a number of special streams and devices. You can choose between a standard DOS teletype interface and a multi-colour window based environment. In the latter case, "user" is simply a name applied to any given window of your choice.
Raw keyboard input is also supported by DOS-PROLOG, allowing for the single-key control of applications. When a mouse is attached to the system, the buttons are treated as keys in their own right during raw input, a feature which can be useful in debugging, or even for the (relatively) remote control of demonstrations.
For special systems applications, a secondary MDA monitor can be controlled in addition to the primary CGA, EGA, VGA or SVGA monitor. This feature is especially useful in the debugging of graphics programs, where diagnostic messages displayed on the main screen would disrupt the graphical output being tested.
Text String Data Type
While the traditional Prolog "string", namely a list of 8-bit integers, is a powerful means by which to manipulate small pieces of text, it is extremely inefficient as a way of storing text. This inefficiency would be compounded in a 32-bit Prolog system, where each one-character element of the list would need 10 bytes of storage on the dynamic heap. To enable bulk text handling applications, DOS-PROLOG includes a true text string type. Any number of such strings may exist in Prolog terms, and each may contain up to 64Kb of text. Because of the way these strings are stored, garbage collection is still quick and efficient, and even string space cannot become fragmented.
Many built in predicates use strings as a means of passing large amounts of text or binary data around, for example between a window buffer and a disk file. In addition, strings may be used as input and output streams: for example, the output of a call to listing can be redirected into a string, binding a variable to a string containing the entire text of a program listing. This string could be manipulated, displayed in a window, and processed in virtually any fashion.
Strings provide a data type with a high bandwidth for communications between DOS-PROLOG and low-level data types, such as files and screen buffers. They are of particular importance as parameters in DOS-PROLOG's window-handling predicates.
LZSS Data Compression
An interesting feature in DOS-PROLOG is the provision of a pair of routines for LZSS data compression and decompression. These are used internally by the system state save and restore predicates, but are also available for general use in user programs through two special I/O predicates. A modified version of the original LZ77 algorithm, LZSS uses a sliding window to hold processed data, while a look-ahead buffer peeks into the stream of data yet to be compressed. Depending upon the sizes of sliding window and look-ahead buffer, compression ratios of up to 64:1 are theoretically possible for highly patterned data; in practice, ratios of 2:1 to 4:1 are more usual.
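The sliding window and look-ahead buffer can be seen at work in the following Python sketch of LZSS. It is deliberately simplified: the output is a list of tokens rather than a packed bit stream, the match search is a plain linear scan, and the window, look-ahead and minimum match sizes are classic textbook values assumed for the example, not those used by DOS-PROLOG.

```python
WINDOW = 4096    # sliding window size (assumed)
LOOKAHEAD = 18   # look-ahead buffer size (assumed)
MIN_MATCH = 3    # shorter matches are cheaper to emit as literals


def lzss_compress(data: bytes) -> list:
    """Return a list of literals (ints) and (offset, length) references."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        limit = min(LOOKAHEAD, len(data) - pos)
        # Search the window of already processed data for the longest match
        for cand in range(max(0, pos - WINDOW), pos):
            length = 0
            while length < limit and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= MIN_MATCH:
            tokens.append((best_off, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lzss_decompress(tokens: list) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])  # byte-wise copy allows overlapping matches
        else:
            out.append(token)
    return bytes(out)
```

Highly patterned data collapses into a few references, which is where the large theoretical ratios come from; data with little repetition is emitted mostly as literals.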
There are many potential applications of LZSS compression: these include the creation of stuffed archive files, which contain an accumulation of compressed files with their names, creation dates/times and attributes, the simple stuffing of individual files, such as bitmaps, which normally use large amounts of disk space, or the compacting of data which needs to be transmitted by modem or some other slow device. More advanced uses might include the creation of mixed database files consisting of uncompressed index information, and compressed data records.
MZSS Data Encryption
As its name suggests, MZSS encryption makes use of a Marsaglia/Zaman pseudo random number generator (PRANG), which has the primary benefit of offering a very large key size (1185 bits, compared with just 64 bits in DOS-PROLOG's existing linear-congruential PRANG!). The MZ/PRANG is seeded by a user-specified password of up to 148 characters, and successive numbers are then combined (XOR) with the plaintext in order to encrypt it, or with the cyphertext in order to decrypt it. Two special features of MZSS encryption make it especially secure: "comb filtering" and "sequence variation".
Although having excellent random properties, and a massively long cycle (well over 2^1180 numbers in the sequence), there is a weakness in the pure MZ/PRANG. If a history of the previous 37 numbers is visible, it is possible to guess the next one +/- 1 bit: if the first 37 characters of a document's plaintext are known, this is quite sufficient for an intelligent attack to decrypt the rest of the document. MZSS encryption prevents such an attack with the help of "comb filtering", in which only a random sample of numbers from the MZ/PRANG sequence is actually used.
Another common attack on simple cyphertext is to find some other document for which both the encrypted and decrypted versions are available. Suppose that a document, VULNERABLE, is to be attacked, and the attacker has access to the PLAINTEXT and CYPHERTEXT versions of another document: on the (fairly reasonable) assumption that the same password has been used to encode both VULNERABLE and CYPHERTEXT, it is possible to decode the former, at least as far as the length of the PLAINTEXT document, by combining PLAINTEXT and CYPHERTEXT to obtain an intermediate KEYSEQUENCE, then combining this with VULNERABLE to generate the desired DISCOVERED document. MZSS encryption prevents such an attack by adding additional data to the user's password: this results in "sequence variation", so that even if the same password is used on more than one occasion, the random sequence it generates is completely different each time.
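The known-plaintext attack just described, and the effect of sequence variation, can be demonstrated with any XOR stream cipher. In the Python sketch below the key sequence comes from SHA-256 in counter mode, which is a stand-in chosen for brevity and is not the Marsaglia/Zaman generator; the `salt` parameter plays the role of the additional data added to the password, and all function names are hypothetical.

```python
import hashlib


def keystream(password: bytes, salt: bytes, length: int) -> bytes:
    """Stand-in key sequence: SHA-256 in counter mode (NOT the MZ generator)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(password + salt + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    """Combine two byte strings with XOR, up to the length of the shorter."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(password: bytes, text: bytes, salt: bytes = b"") -> bytes:
    """XOR the text with the key sequence; the same call also decrypts."""
    return xor(text, keystream(password, salt, len(text)))


def attack(plaintext: bytes, cyphertext: bytes, vulnerable: bytes) -> bytes:
    """Recover VULNERABLE using a known PLAINTEXT/CYPHERTEXT pair."""
    keysequence = xor(plaintext, cyphertext)
    return xor(vulnerable, keysequence)  # the DISCOVERED document
```

When both documents are encrypted with the same password and no salt, `attack` returns the hidden text; when each document is given its own salt, the key sequences differ and the attack yields only noise.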
Term and Data Management
There are many special features in DOS-PROLOG geared towards term and data management. An assembler-coded, high speed list/merge sort algorithm gives truly stunning performance. Displaying X*log2(X) performance characteristics, it is capable of sorting lists of 10,000 elements in around two seconds on a 25MHz 386. The sort/3 predicate goes beyond the functionality of sort/2, by allowing you to define a sort key of arbitrary depth: if the elements in the list you are sorting are complex compound terms, you can uniquely identify which subterm you want to use for sorting. This powerful feature allows you, for example, to sort a given list of names and addresses by surname, street name, town or post code, simply by changing the sort key parameter.
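Sorting on a key of arbitrary depth can be pictured as following a path of argument positions into each term before comparing. The Python sketch below models compound terms as tuples whose first element is the functor, so that position 1 is the first argument; the function names and the path notation are assumptions for illustration and are not the calling convention of sort/3.

```python
def subterm(term, path):
    """Follow a path of argument positions down into a nested term."""
    for position in path:
        term = term[position]
    return term


def sort_by_path(terms, path):
    """Sort terms by the subterm found at the given path."""
    # sorted() is a stable, merge-based sort, so equal keys keep their order
    return sorted(terms, key=lambda term: subterm(term, path))


PEOPLE = [
    ("person", ("name", "Grace", "Hopper"), ("address", "1 Navy Rd", "Arlington")),
    ("person", ("name", "Ada", "Lovelace"), ("address", "12 High St", "London")),
    ("person", ("name", "Alan", "Turing"), ("address", "5 Park Ln", "Bletchley")),
]
```

With this data, `sort_by_path(PEOPLE, (1, 2))` orders the list by surname, while `sort_by_path(PEOPLE, (2, 2))` orders it by town; only the path changes.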
Dynamic Linking Low-level Language Interface
A special feature of DOS-PROLOG is its dynamic linking of modules written in 32-bit assembler. Unlike previous versions of LPA Prolog, in which such assembler modules had to be linked into the main Prolog executable file, DOS-PROLOG can dynamically load assembler modules. Each such module may define one or more predicates which behave just like any compiled Prolog program.
Assembler modules need no complex linking or special calling conventions: just load the file, and use the predicates like any other. Furthermore, assembler predicates can be abolished when finished with, allowing their memory to be freed up for other purposes. As with all other areas of DOS-PROLOG memory management, the space used by external modules is not subject to any kind of fragmentation or other forms of degradation as it is used, freed and reused.
The Operating System Interface
As you would expect from LPA's Prolog implementations, DOS-PROLOG has a huge library of operating system interface functions, providing for disk file and directory management, program execution, the reading of environment variable strings, and time and date functions.
Predicates allow you to create, rename and remove files or directories, and you can test and modify file attributes and timestamps. File directories can be read according to file name and attribute matching, returning information about each file's size, time and date.
For specialist time and date applications, a built in predicate allows you to compute absolute day numbers for any given date, or vice versa, allowing days between dates, lunar phases, and other such calculations to be made.
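Absolute day numbers turn date arithmetic into integer arithmetic. The Python sketch below uses the standard library's ordinal day numbers, which count from 1 January of year 1 in the proleptic Gregorian calendar; the epoch used by DOS-PROLOG is not stated, so this numbering, like the function names, is an assumption.

```python
from datetime import date


def day_number(year: int, month: int, day: int) -> int:
    """Convert a calendar date to an absolute day number."""
    return date(year, month, day).toordinal()


def date_of(number: int) -> tuple:
    """Convert an absolute day number back to (year, month, day)."""
    d = date.fromordinal(number)
    return (d.year, d.month, d.day)


def days_between(first: tuple, second: tuple) -> int:
    """Number of days from the first date to the second."""
    return day_number(*second) - day_number(*first)
```

Leap years and month lengths are handled by the conversion itself, so a difference such as 1 January to 1 March 2000 is a single subtraction.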
Programmable Interrupt Timers
For special purposes, up to 64 completely independent, programmable timers may be set or tested. When a timer expires, it interrupts the execution of DOS-PROLOG, and calls a specially defined user "hook" predicate. This predicate can perform various operations, before resetting the timer. Because of the design of the timers, real-time synchronisation is maintained, even if an individual hook is delayed in its operation. Uses for timers include timeouts on user input, periodic updating of clock or calendar displays, the writing of sophisticated profile tools, and so on.
Sophisticated Source-level Debugger
A full source-level debugger displays program source code, variable bindings, and other information. A multi-level break facility allows you to escape from the debugger to run supplementary queries before returning. A traditional box model debugger, together with a collection of small, special purpose debuggers, complements the source-level debugger to provide unprecedented flexibility in program testing.
The DOS-PROLOG Windowing Subsystem
One of the key features of DOS-PROLOG is its built-in windowing subsystem. Supporting both text and graphics applications, it permits full use of colour on CGA or compatible graphics adapters, and exceptionally fast screen updating. On EGA or VGA video adapters, it is further capable of supporting features such as 43*80, 50*80 and 60*132 character text modes, and full colour windows support in graphics modes. A full set of predicates allows windows to be created, deleted, moved, coloured, etc, so that highly professional software packages can be written in Prolog. Resizable edit windows provide full scrolling, and there is even a built-in screen saver to protect your VGA screen during long periods of idle time!
Systems Programming Features
DOS-PROLOG contains a number of features which are geared towards systems programming, and which would be difficult or impossible to provide on Windows platforms. Some of these are described below.
For screen handling, direct access is provided to the IBM PC INT 10h BIOS interrupt. This allows you to switch or test screen modes, change the number of lines on a text screen, and many more functions. Likewise, direct access is given to the Microsoft mouse INT 33h control interrupt, allowing complete control over your pointer device.
Elapsed times, down to a resolution of about 0.2ms, can be obtained for accurate benchmarking of even the briefest of events, and an assortment of other predicates allows control over the PC speaker and access to the various shift and control keys.
The GraFiX Interface
As might be expected, DOS-PROLOG is fully compatible with the GraFiX interface. Going well beyond the capabilities of the previously used GSX graphics kernel, GraFiX supports all current screen modes on the VGA, EGA and CGA displays, as well as the high-resolution mode on Compaq Plasma/Olivetti monochrome screens. Full colour can be displayed within the capabilities of any given graphics adapter.
The GraFiX kernel is run separately from Prolog, so that non-graphics applications are not penalised in terms of memory usage. In use, GraFiX takes just 94kb of DOS memory, but this does not interfere with DOS-PROLOG's space. Furthermore, it can easily be removed from memory upon completion of the application.
GraFiX provides a comprehensive set of line, ellipse, filled area, and pixel level functions. Colour, line styles, fill patterns, and the colour palette are all programmable. Text can be displayed in any position, and the size of the font is variable. Bitmaps can be stored and subsequently displayed or combined with existing images (using either the industry standard .PCX or the proprietary .GFX format). Finally, support is given for using the Microsoft Mouse.
For musical and other multimedia applications, GraFiX includes some MIDI interface functions. Using the IBM PC timer chips for high accuracy timing, these allow note on, note off and MIDI controller messages to be received or transmitted in real time by Prolog programs. Other functions provide support for system exclusive and bulk dump messages. The industry standard MPU401, serial port "ToHost" and KEY Electronics MIDIator interfaces are all supported.