# =========== Header =========== # File: Paged Stores Format # Project: ATA Support for NewtonOS # Written by: Paul Guyot (pguyot@kallisys.net) # # Created on: 07/17/2000 # Internal version: 1 # # Copyright: © 2000-2003 by Paul Guyot. # All rights reserved worldwide. # =========== # =========== Change History =========== # 07/14/2003 v3 [PG] Updated for version 4 # 12/16/2000 v2 [PG] New version of the format # 07/17/2000 v1 [PG] Creation of the file # =========== Abstract ======== This document explains how data is stored into Paged Stores with ATA Support. Paged Stores are persistent object stores for NewtonOS with 512 bytes aligned transactions. Introduction ============ ATA media is not at all like Flash media because there is no need to use 64 KB erase blocks. The minimum for each transaction is 512 bytes (a sector). Newton Stores are persistent object stores. Objects are referred by a 32 bits ID. The maximum size for Flash Stores is 0xFFFF bytes (65535). Objects can be created, deleted, resized, partially and entirely read and written. This format is an answer to the problem of designing a store with those properties minimizing the unused space and the required writes for each modifications and maximizing the speed of each transaction. Another constraint is to allow dead sectors (this does not seem to be implemented into Flash stores) and defragmentation since stores may be rather big (see below). Description of the Format ========================= Each store is made of five parts: A header sector. This sector includes various information about the store. A map of the use of each sector. A table for transactions (can be restricted to a single sector if there isn't enough transactions) A table for reallocation (optional) Data sectors. The Header Sector ----------------- The header provides information concerning the version of the store and various other data. The header format is the following: ULong signature should be 'Stor' (53746F72). ULong format version should be 00000004. ULong length of store in sectors, all sectors. ULong map first sector where the map first sector can be found (usually next sector, i.e. 00000001). ULong transaction table first sector where to find the table for transactions. ULong translation table first sector where to find the table for translations, 0 means there is no translation. ULong separate transaction table first sector where to find the table for separate transactions, 0 means there is no object in separate transaction. ULong root ID ID of the Root object. UShort flags Store flags. These are only written and used if the store isn't in a Newton Store Collection. UShort reserved Should be 0. ULong pool size Number of reserved sectors at the end of the store. The next bytes are reserved for future use (should be 0). The Map Sectors --------------- The Map describes the use of the sectors. It is used to find data and to know how to allocate new objects. The Map sectors are made of 508 bytes defining the use of 508 sectors and an optional next ID for the next map sector. On the last map sector, the next ID should be 0. Each sector is represented by a 8 bits value. 0xFF mapped out (dead sector). If this sector is refered to by an object ID should be translated in the translation sector. 0xFE non data sector (header, map, transaction table or reallocation table) or sector in transaction. 0xFD full or nearly full. 0x01-0xFC size in words 0x00 empty The map includes as many entries as there are data sectors in the store. If the sector is non data, then it is searched in the transaction table and the translation table if not found in the transaction table. If it is in one of these tables, the corresponding sector is used instead. (see below) Otherwise, the sector is considered as non accessible sector. Transaction and Translation Tables Format ----------------------------------------- These tables are in a format called Persistent Map Table which includes couples of IDs (object IDs or sector IDs). If the IDs fit on 3 bytes, the format is the following: UChar IDs[84][6] 84 couples ULong format should be 00000001 ULong next sector ID absolute (apparently, it's not relative to the beginning of the store, I should change this), 00000000 if it's the last sector. If the IDs fit on 4 bytes only, the format is the following: ULong IDs[63][2] 63 couples ULong format should be 00000000 ULong next sector ID The transaction table --------------------- This table is made of couples of sector IDs. The first one indicates a sector currently in transaction, the second one the sector where the copy is. If this ID is 0, there is no copy because the sector was previously empty (I'm not so sure about this, maybe it is if both IDs are equal). [Note: I'm not sure the following lines are still valid, but they do not matter much for the format itself] The transaction works the following way: - if the sector entering in transaction is empty, a new entry in the transaction table as (its ID, 0) is added. - if it is not empty: - a new sector is found and is marked as non data sector. - the sector entering in transaction is copied to this non data sector - a new entry in the transaction table as the sector entering in (transaction's ID, its copy's ID) is added - the sector entering in transaction is marked as non data sector - the sector in transaction is modified during transaction. At the end of the transaction, - the sector in transaction is marked as data sector with its proper size. - the entry is removed from the transaction table - the copy sector if any is marked as free The translation table --------------------- The translation table is used in the case the card is damaged to reallocate blocks. Please note that first versions of the ATA Support software won't support translations and that the use of translated sectors is slow. It is advised to re-format the store if required. The translation table's format is like the transaction table. The first element of the couple is the sector translated from and the second element is the translated new sector. Please note that the new sector is not marked as non data sector. [Note: I think I read the table now, I surely don't write it] The separate transaction table ------------------------------ Unlike the two previous tables, this table is made of object ID couples (instead of sector ID couples). When an object is marked in a separate transaction, the object is copied. I think the work is done on the copy, but I would have to checked. The table includes couples (original, copy) just like the transaction table. If the second object ID is 0, the object is no longer in separate transaction. If second object ID is 1, the object (the copy) has been deleted, and commiting means deleting the original. After a quick look in the code, I'm confident that the original is kept unmodified and the copy is modified. This takes less space on disk but it explains the long process when a package installation is over, so it's not terrific. I should change this and work on the original like with regular transactions, however, there might be some additional reason for not doing so, I have to make tests. Conclusion on transactions: the store shouldn't be unmounted with pending transactions anyway. And if it is, an abort is performed and then it's in a known state. This means deleting every second element of the separate element pairs (except when they're 0 or 1) and copying transaction copy sectors back. Data sectors ------------ Data sectors are made of entries, each entry added one after the other. (this means that everytime a change is made to a sector, the whole sector is garbaged collected to put everything at the beginning, leaving free bytes at the end). An entry is made of: + header (16 or 48 bits) 9 bits: size in bytes (including the header), minus one. The last bit is the MSB, hence the rule to get the size is: theSize = fSizeByte | (fIDandFlagsByte & fSizeMask << 1); ( or, for quicker access with ARM: theSize = fSizeByte; if (fIDandFlagsByte & fSizeMask) theSize |= 0x0200; ) 1 bit: entry is fragmented. 1 bit: entry is a continued object 5 bits: ID if entry is fragmented, there is a fake object ID following (32 bits). Those IDs cannot be referred from the system but are used to locate the following fragment. Each fragment (except the first one) has the continued object bit set to one. + data (any size) The minimum size for an entry is 16 bits (null size) or 48 bits (alias with no data). 0000 doesn't exist because an empty object should be coded 0100. Objects ID ---------- Those are decoded the following way: First 27 bits: sector's number (0 means first data sector) Last 5 bits: ID within sector Therefore, this format supports: 2^5 = 32 entries per sector 2^27 sectors, i.e. 64 GB If object IDs have to be NewtonScript integers, we have: 2^30 IDs hence 2^25 sectors, i.e. 16 GB Various considerations for the implementation ============================================= The driver's goal is to full *real* entries. It tries to keep sectors with either only fake entries or real entries. (so that a defragmentation will be easy to do). There is no need to keep empty aliases in the chain. Aliases should contain data. Aliases are allocated once there is no left room on the sector to grow the entry. If the sector starts with an entry continued data, the sector is not chosen to add data when possible. (therefore to grow an entry, a free sector is used instead). Defragmentation may be immediately followed by a fragmentation from the system if some objects are resized just afterward. Operations with objects larger than 64 kb, although allowed in the format, may or may not be forbidden. There may be some automatic defragmentation from time to time. Minimum size for a store is 4 sectors (i.e. one header, one map, one transaction and one data sector, but this store won't support transactions except if the data sector is empty). ## ======================================================================= ## ## Mouarf ! Evidemment, aucune opŽration QuickDraw n'est jamais accŽlŽrŽe, ## ## et tous les petits pixels jusqu'au dernier sont dessinŽ par le p'tit G3 ## ## ˆ la mimine. ;-) ## ## -+- Ol. in Guide du Macounet Pervers : Carte vidŽo? Kesako? -+- ## ## ======================================================================= ##