Introducing mode Y
	  Overview
	    
			This guide attempts to cover the basics of VGA programming and more specifically mode Y, which is a tweaked mode 13 offering 4 pages of video memory in 320x200 and 256 colors. In other words: a dream come true... well, for the most part. The example code provides replacement for CLS, PSET, POINT, PCOPY and SCREEN (both to set screen mode and flip pages) to use with mode Y.
			
		Mode 13 Memory Addressing
		  
			Mode 13 has been one of QBasic programmers' favorite graphic mode due to its 256 colors and the way pixels are organized in a linear fashion in memory. In order to speed up per-pixel drawing, they began to ditch the built-in PSET instruction in favor of DEF SEG and POKE. Soon, they would write code that looked like this:
			SCREEN 13                          ' Initialize mode 13.
			DEF SEG = &HA000                   ' Video memory segment.
			FOR y% = 0 TO 199                  ' For each line.
			  FOR x% = 0 TO 319                ' For each column.
			    POKE (CLNG(y%) * 320) + x%, 4  ' Write 4 to pixel offset.
			  NEXT x%
			NEXT y%
			DEF SEG
END
			Unfortunately, 16-bit signed INTEGERs cannot reach each 64,000 pixel on the screen because the positive range of that specific data type only goes from 0 to 32,767, and LONG is marginally slower than INTEGER. Thus, some relied on two different segments to divide the screen height in two: &HA000 (which points to the very first pixel on the upper left corner of the screen), and &HA7D0 (which points to the pixel located at column 0, line 100); depending on the Y position of the pixel, they would use one segment or the other, and increase the offset for each subsequent pixel as so:
			
SCREEN 13
			FOR y% = 0 TO 199
			  IF (y% < 101) THEN
			    DEF SEG = &HA000
			    FOR x% = 0 TO 319
			      POKE (y% * 320) + x%, 4
			    NEXT x%
			  ELSE
			    DEF SEG = &HA7D0
			    FOR x% = 0 TO 319
			      POKE ((y% - 100) * 320) + x%, 4
			    NEXT x%
			  END IF
			NEXT y%
END
			How is that possible exactly? Let's have a quick recap about memory segments and memory offsets. In DOS, memory addresses are usually expressed in two pairs of hexadecimal values such as "0000:0000". The left part of the address points to a segment (that's the value used in DEF SEG and returned by VARSEG), while the right part points to an offset (the value used in PEEK or POKE, returned by VARPTR or SADD for variable-length strings). If increasing the offset by one will point one byte further into memory, increasing the segment by one will point sixteen bytes further into memory. In other words, sixteen "offsets" are worth one "segment" (A000:0010 and A001:0000 point to the same memory address.) Thus, a more convenient approach would be:
			SCREEN 13
			DIM clr AS INTEGER
			clr = 4
			FOR y% = 0 TO 199               ' One full line is 320 bytes since
			  DEF SEG = &HA000 + (20 * y%) ' each pixel is stored on its own
			  FOR x% = 0 TO 319             ' offset. Therefore, we have to move
			    POKE x%, clr                ' 320 offsets for each line, or move 
			  NEXT x%                       ' 320 \ 16 offsets = 20 segments.
			NEXT y%
			DEF SEG
END
			This works because in mode 13, each pixel is one byte long and the whole screen is stored as a big contiguous array of 64,000 bytes starting at address A000:0000... or so it seems. In theory, this mode only requires 64K of video memory, although most VGA cards required to support mode 13 have up to 256K available. So the question is: how do we access the "missing" 192K?
		  
		Initializing mode Y
		  
			Mode Y is a tweaked mode 13, so let's start by initializing mode 13 and reprogram some registers. VGA programming involves a lot of talking with ports. Each element of the VGA has an Index and a Data port: the Index port is used to send the function index we want to access and the Data port is used to send or receive information to and from the currently selected function. We're not going to use all of them, but here's the full list anyway:
			' Sequence Controller (SC)
			CONST scINDEX = &H3C4
			CONST scDATA = &H3C5
			CONST scReset = 0
			CONST scClock = 1
			CONST scMapMask = 2
			CONST scCharmap = 3
			CONST scSequencerMemoryMode = 4
			' Graphic Controller (GC)
			CONST gcINDEX = &H3CE
			CONST gcDATA = &H3CF
			CONST gcSetReset = 0
			CONST gcEnableSetReset = 1
			CONST gcColorCompare = 2
			CONST gcDataRotate = 3
			CONST gcReadMapSelect = 4
			CONST gcGraphicsMode = 5
			CONST gcMiscellaneous = 6
			CONST gcColorDontCare = 7
			CONST gcBitMask = 8
			' Cathode Ray Tube Controller (CRTC)
			CONST crtcINDEX2 = &H3D4
			CONST crtcDATA2 = &H3D5
			CONST crtcHorizontalTotal = 0
			CONST crtcEndHorizontalDisplay = 1
			CONST crtcStartHorizontalBlank = 2
			CONST crtcEndHorizontalBlank = 3
			CONST crtcStartHorizontalRetrace = 4
			CONST crtcEndHorizontalRetrace = 5
			CONST crtcVerticalTotal = 6
			CONST crtcOverflow = 7
			CONST crtcPresetRowScan = 8
			CONST crtcMaximumScanLine = 9
			CONST crtcCursorStart = 10
			CONST crtcCursorEnd = 11
			CONST crtcStartAddressHigh = 12
			CONST crtcStartAddressLow = 13
			CONST crtcCursorLocationHigh = 14
			CONST crtcCursorLocationLow = 15
			CONST crtcVerticalRetraceStart = 16
			CONST crtcVerticalRetraceEnd = 17
			CONST crtcVerticalDisplayEnd = 18
			CONST crtcOffset = 19
			CONST crtcUnderlineLocation = 20
			CONST crtcStartVerticalBlank = 21
			CONST crtcEndVerticalBlank = 22
			CONST crtcModeControl = 23
			And now, for initializing mode Y:
			'
' Enter mode Y
'
			SUB ySCREEN()
			  DIM aByte AS INTEGER
			  ' Set standard 320x200x256 (linear)
			  SCREEN 13                                   ' or Interrupt &H10, AX=&H13
			  ' Disable chain-4 mode
			  OUT scINDEX, scSequencerMemoryMode         ' sequencer memory mode
			  aByte = INP(scDATA)                         ' get current value
			  aByte = (aByte XOR (aByte AND 8)) OR 4     ' bit 3 (Chain-4) off, bit 2 (O/E Dis.) on
			  OUT scDATA, aByte                           ' aByte should be &H06
			  ' Turn off odd/even and set write mode 0
			  OUT gcINDEX, gcGraphicsMode                 ' graphic mode
			  aByte = INP(gcDATA)                         ' get current value
			  aByte = (aByte XOR aByte AND (1 + 2 + 16)) ' turn bits 0-1 (write) and 4 (Host O/E) off
			  OUT gcDATA, aByte                           ' aByte should be &H40
			  ' Disable chain
			  OUT gcINDEX, gcMiscellaneous               ' graphic miscellaneous
			  aByte = INP(gcDATA)                         ' get current value
			  aByte = aByte XOR (aByte AND 2)             ' bit 1 (Chain O/E) off
			  OUT gcDATA, aByte                           ' aByte should be &H05
			  ' Disable long mode
			  OUT crtcINDEX2, crtcUnderlineLocation       ' underline location
			  aByte = INP(crtcDATA2)                     ' get current value
			  aByte = aByte XOR (aByte AND 64)           ' turn bit 6 (double-word addressing) off
			  OUT crtcDATA2, aByte                       ' aByte should be &H00
			  ' Enable byte mode
			  OUT crtcINDEX2, crtcModeControl             ' mode control
			  aByte = INP(crtcDATA2)                     ' get current value
			  aByte = aByte OR 64                         ' turn bit 6 (word/byte mode select) on
			  OUT crtcDATA2, aByte                       ' aByte should be &HE3
			END SUB
			Easy, isn't it? Yes it is. And now, on to the bad news: first, mode Y is undocumented and none of QBasic's graphic instructions will work (no CLS, PSET, PRINT, DRAW, LINE, CIRCLE, POINT, GET, PUT or PAINT.) Second, pixels are no longer organized in that easy-to-use mode 13 fashion and are instead distributed across 4 planes.
			
	
		Planes Pain
			
			All 64,000 pixels of each 320x200 video page of mode Y is divided in 4 different planes of 16K each. Pixels are distributed according to their X coordinate (the column they are located in). Plane 0 would access columns 0, 4, 8, 12, 16, etc. Plane 1 would access columns 1, 5, 9, 13, 17, etc. Plane 2 would access columns 2, 6, 10, 14, 18, etc. And plane 3 would access columns 3, 7, 11, 15, 19, etc. To identify the plane a specific pixel belongs to, all we have to do is:
			plane = pixelX MOD 4 ' easy to understand modulo, or...
			plane = pixelX AND 3 ' weird bit masking thingy
			As a direct consequence of mode Y's planar organization, pixels share the exact same memory address four by four: pixels located at (0, 0), (1, 0), (2, 0) and (3, 0) are all located at A000:0000, pixels at (4, 0), (5, 0), (6, 0), (7, 0) are all located at A000:0001, etc. In order to modify a specific pixel, you have to provide the VGA chipset the proper plane to modify and then write to memory as usual. Let's write replacements for PSET, CLS and POINT.
For PSET, we rely on the SC Map Mask Register to POKE information to the right plane. The Map Mask Register allows us to select one, two, three, or all four planes at once by masking them via their bitflag values: 1 for plane 0, 2 for plane 1, 4 for plane 2, and 8 for plane 3 (to select all planes, just do 1 + 2 + 4 + 8 = 15). After selecting the plane we want to work with, we reach to the page memory Segment via DEF SEG and POKE the pixel value. The X coordinate is divided by 4 because four pixels are stored on the exact same address. The same way, while 320 pixels are displayed, only 80 addresses exist for that line. Thus, we move one line down by multiplying the Y coordinate by 80.
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Set pixel on active page
'
			SUB yPSET (x AS INTEGER, y AS INTEGER, c AS INTEGER)
			  OUT scINDEX, scMapMask     ' select plane masking function
			  OUT scDATA, 2 ^ (x AND 3)  ' mask only one plane
			  DEF SEG = yPageSeg         ' select video page
			  POKE (x \ 4 + y * 80), c   ' plot pixel
			  DEF SEG
			END SUB
			Since planes are accessed via bit mask, if we select all 4 planes at once, we only have to write to 16,000 addresses to color all 64,000 pixels of the page! The new CLS routine is basically the same code as PSET, if only a lot simpler:
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Clear active page
'
			SUB yCLS (c AS INTEGER)
			  OUT scINDEX, scMapMask     ' select plane masking function
			  OUT scDATA, &HF            ' select all four planes
			  DEF SEG = yPageSeg         ' select video page
			  FOR i% = 0 TO 15999        ' write to 16,000 addresses
			    POKE i%, c
			  NEXT i%
			  DEF SEG
			END SUB
			The POINT replacement routine is different from PSET because it relies on GC register Read Map Select rather than SC register Map Mask. Also, while it is possible to write to multiple planes at once using a bit mask, it is not possible to read from more than one plane at a time and thus, this function takes the plane index (which is still the X coordinate modulo 4:)
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Read pixel on active page
'
			FUNCTION yPOINT% (x AS INTEGER, y AS INTEGER)
			  OUT gcINDEX, gcReadMapSelect ' select plane reading function
			  OUT gcDATA, x AND 3          ' select plane by index
			  DEF SEG = yPageSeg           ' select video page
			  yPOINT% = PEEK(x \ 4 + y * 80)
			  DEF SEG
			END FUNCTION
			
		Page Operations
			
			In the previous routines, we used a variable called "yPageSeg"; this variable is intended to store the memory segment of the currently active page. This is where things get really interesting: in mode Y, each page is located 1,000 segments, or 16,000 offsets away (the size of one plane) from one another, starting at segment &HA000. At this point, you probably figured out that pages are "stacked" on top of one another, which means that increasing the memory offset by 80 will "push" the currently displayed page one line higher; essentially, you have a huge 320x800 pixels buffer, of which only 320x200 pixels are displayed.
			The first routine we need will change the active page (the one we're currently working on.) It's a very simple routine that provides the memory address of the top left pixel of the video page by multiplying the page index by 1,000 and adding the video memory address (&HA000.) Remember: a whole video page has 64,000 pixels divided in 4 planes; that makes 16,000 unique addresses. A segment is 16 offsets, so each page is 1,000 "segments" apart:
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Change active page
'
			SUB yPAGEWORK (pageNum AS INTEGER)
			  yPageSeg = &HA000 + pageNum * 1000
			END SUB
			The second routine, yPAGEFLIP, changes the display page (the one that the user sees.) This allows double-buffering, a technique where you display a finished page until the next frame is done being rendered, and then switch the two pages. To achieve this, we simply tell the VGA chip to change the display memory offset (where the upper left pixel is located in memory) by using CRTC registers Start Address High and Low:
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Flip to another page
'
			SUB yPAGEFLIP (pageNum AS INTEGER)
			  DIM adr AS STRING
			  adr = LEFT$(MKL$(CLNG(pageNum) * 16000), 2)
			  DO: LOOP WHILE (INP(&H3DA) AND 1)
			  OUT crtcINDEX2, crtcStartAddressHigh
			  OUT crtcDATA2, ASC(RIGHT$(adr, 1))
			  OUT crtcINDEX2, crtcStartAddressLow
			  OUT crtcDATA2, ASC(LEFT$(adr, 1))
			  DO: LOOP UNTIL (INP(&H3DA) AND 8)
			END SUB
			
			Latches
			
			We've seen that there's one register to read pixels (GC Read Map Select) and another to write pixels (SC Map Mask.) This is great because it means we can read and write without swapping register functions. We've also seen that we can apply one value to four different pixels at once. If only there was a way to read more than one pixel and write different values at once, we could copy pixels from one page to the other, four pixels at a time! Well, we can, but this requires some trickery.
			The VGA chipset doesn't just copy and paste information you feed it. Before writing stuff to video memory, the VGA will combine the information with built-in latches. These latches are filled by the content of all four planes located at the last memory address read. For instance, when you PEEK at A000:0001, the latches are automatically filled with pixels located at (4, 0), (5, 0), (6, 0) and (7, 0). What we need to do is program the VGA chipset so we can make a bogus write that will force latches to be copied (unchanged) to another video memory address:
			'
' Copy one page to another
'
			SUB yPAGECOPY (dstPage AS INTEGER, srcPage AS INTEGER)
			  DIM ofs AS LONG
			  OUT scINDEX, scMapMask     ' select plane masking function
			  OUT scDATA, &HF            ' select all four planes
			  OUT gcINDEX, gcBitMask     ' select CPU bit masking function
			  OUT gcDATA, &H0            ' unmask all bits
			  IF (dstPage < srcPage) THEN
			    ofs = CLNG(srcPage - dstPage) * 16000
			    DEF SEG = &HA000 + dstPage * 1000
			    FOR i% = 0 TO 15999
			      POKE i%, PEEK(ofs + i%)
			    NEXT i%
			  ELSE
			    ofs = CLNG(dstPage - srcPage) * 16000
			    DEF SEG = &HA000 + srcPage * 1000
			    FOR i% = 0 TO 15999
			      POKE (ofs + i%), PEEK(i%)
			    NEXT i%
			  END IF
			  OUT gcDATA, &HFF           ' mask all bits
			  DEF SEG
			END SUB
			This code selects all planes (like we did with yCLS), and then sets Graphic register Bit Mask to 0, which will "cancel" the information that we manually send to the VGA so it has to rely on data previously stored in latches. With this configuration, it doesn't matter what we PEEK or POKE, all that matters is to read and write at the proper memory addresses. When the operation is done, we put back the Bit Mask to "all" by setting its value to &HFF. This method could be used to copy portions of one page to the other using latches as long as the width of the section and both the source and target X coordinate is plane-aligned (multiple of 4.)
      
			Print replacement
			
			So far, we still have no replacement for PRINT, so let's get to it. The X and Y values assume a standard 8x8 character matrix (so 40 columns and 25 rows.) We'll be using CALL INTERRUPTX to obtain the memory address where BIOS characters are stored (in theory the address shouldn't change but better safe than sorry.) It's a good example of SC Map Mask function being used to draw multiple pixels at once. The code could easily be optimized (unrolling loops, precaching BIOS characters and reversing their bits once) for programs that need the extra speed:
			'$INCLUDE: 'QB.BI'
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Print with BIOS character map
'
			SUB yPRINT (x AS INTEGER, y AS INTEGER, c AS INTEGER, msg AS STRING)
			  DIM chrSeg AS INTEGER, chrOfs AS INTEGER
			  DIM ofs AS INTEGER, ofs2 AS INTEGER, size AS INTEGER
			  DIM regs AS RegTypeX
			  DIM scn(7) AS INTEGER, iRev(0 TO 15) AS INTEGER
			  ' Lookup table for 4-bit reversal (ie. 0100 -> 0010)
			  iRev(0) = &H0: iRev(1) = &H8: iRev(2) = &H4: iRev(3) = &HC
			  iRev(4) = &H2: iRev(5) = &HA: iRev(6) = &H6: iRev(7) = &HE
			  iRev(8) = &H1: iRev(9) = &H9: iRev(10) = &H5: iRev(11) = &HD
			  iRev(12) = &H3: iRev(13) = &HB: iRev(14) = &H7: iRev(15) = &HF
			  ' Avoid string overflow
			  size = LEN(msg)
			  IF ((size + x) > 40) THEN size = 40 - x
			  ' Get memory location of BIOS font (could be a SHARED variable)
			  regs.ax = &H1130                   ' Get font information
			  regs.bx = &H300                     ' ROM 8x8 double dot font pointer
			  CALL INTERRUPTX(&H10, regs, regs)   ' Do interrupt
			  chrSeg = regs.es                   ' Segment to character map
			  chrOfs = regs.bp                   ' Offset to character map
			  OUT scINDEX, scMAPMASK             ' Select map mask register
			  ofs = x * 2 + y * 640              ' Upper-left corner
			  ' Parse string
			  FOR i% = 1 TO size
			    ' Catch character in BIOS RAM
			    DEF SEG = chrSeg
			    ofs2 = chrOfs + 8 * ASC(MID$(msg, i%, 1))
			    ' Read characters from memory
			    FOR j% = 0 TO 7
			      scn(j%) = PEEK(ofs2 + j%)
			    NEXT j%
			    ' Draw character loop
			    DEF SEG = yPageSeg
			    FOR j% = 0 TO 7
			      OUT scDATA, iRev(scn(j%) \ 16)   ' 4 leftmost pixels
			      POKE ofs + 80 * j%, c
			      OUT scDATA, iRev(scn(j%) AND 15) ' 4 rightmost pixels
			      POKE ofs  + 80 * j% + 1, c
			    NEXT j%
			    ofs = ofs + 2 ' Move 8 pixels to the right
			  NEXT i%
			  DEF SEG ' Select default segment
			END SUB
			
  	Line replacement
			
			And now, for a totally unnecessary and obnoxiously (is that even a real word?) long code, here's a replacement routine for drawing line segments, rectangles and solid rectangles. This function is not fool-proof, so be careful with it. The line drawing part is done with Bresenham's line algorithm, which uses INTEGER and is faster than floating-point calculations.
			CONST isFill = 0 ' Draw a solid rectangle
			CONST isRect = 1 ' Draw a rectangle
			CONST isLine = 2 ' Draw a line
			DIM SHARED yPageSeg AS INTEGER
			yPageSeg = &HA000
			'
' Draw a line, a rectangle, or a solid rectangle
'
			SUB ySHAPE (x1 AS INTEGER, y1 AS INTEGER, x2 AS INTEGER, y2 AS INTEGER, _ 
c AS INTEGER, mode AS INTEGER)
			  DIM xA AS INTEGER, yA AS INTEGER, xB AS INTEGER, yB AS INTEGER
			  DIM xDelta AS INTEGER, yDelta AS INTEGER
			  DIM yInc1 AS INTEGER, yInc2 AS INTEGER
			  DIM det AS INTEGER, dInc1 AS INTEGER, dInc2 AS INTEGER
			  DIM numPixels AS INTEGER, ofs AS INTEGER, msk AS INTEGER
			  DIM horiz AS INTEGER, ofsY AS INTEGER, ofs2 AS INTEGER
			  DEF SEG = yPageSeg
			  OUT scINDEX, scMAPMASK
			  '' DRAW STRAIGHT VERTICAL LINE / DRAW POINT
			  IF (x1 = x2) THEN
			    IF (y1 > y2) THEN
			      ofs = y2 * 80 + x1 \ 4
			    ELSE
			      ofs = y1 * 80 + x1 \ 4
			    END IF
			    OUT scDATA, 2 ^ (x1 AND 3)
			    FOR i% = 0 TO ABS(y2 - y1)
			      POKE ofs, c
			      ofs = ofs + 80
			    NEXT i%
			  '' DRAW STRAIGHT HORIZONTAL LINE
			  ELSEIF (y1 = y2) THEN
			    IF (x1 > x2) THEN xA = x2 ELSE xA = x1
			    ofs = y1 * 80 + xA \ 4
			    numPixels = ABS(x2 - x1) + 1
			    ' Align starting point on plane (dirty)
			    IF (xA AND 3) THEN
			      msk = (xA AND 3)
			      IF (numPixels > 4 - (xA AND 3)) THEN
			        msk = &HF XOR (2 ^ (xA AND 3) - 1)
			      ELSE
			        msk = (2 ^ numPixels - 1) * 2 ^ (xA AND 3)
			      END IF
			      OUT scDATA, msk
			      POKE ofs, c
			      numPixels = numPixels - (4 - (xA AND 3))
			      ofs = ofs + 1
			    END IF
			    ' Draw remaining pixels (starting point is now aligned)
			    IF (numPixels > 3) THEN          ' If there's at least 4 pixels,
			      OUT scDATA, &HF                ' Select all four planes
			      DO
			        POKE ofs, c                  ' Draw four pixels
			        ofs = ofs + 1                ' Move to next stride
			        numPixels = numPixels - 4    ' Remove 4 pixels from count
			      LOOP WHILE (numPixels > 3)     ' Loop until less than 4 remain
			    END IF
			    IF (numPixels) THEN              ' There's less than 4 pixels now.
			      OUT scDATA, 2 ^ numPixels - 1  ' Select all that remains.
			      POKE ofs, c                    ' Draw.
			    END IF
			  '' DRAW LINE
			  ELSEIF (mode = isLine) THEN
			    ' Always draw left to right (easier to handle plane selection)
			    IF (x1 > x2) THEN
			      xB = x1: xA = x2
			      yB = y1: yA = y2
			    ELSE
			      xA = x1: xB = x2
			      yA = y1: yB = y2
			    END IF
			    xDelta = ABS(xB - xA)
			    yDelta = ABS(yB - yA)
			    IF (xDelta > yDelta) THEN
			      numPixels = xDelta
			      dInc1 = yDelta * 2
			      dInc2 = (yDelta - xDelta) * 2
			      det = dInc1 - xDelta
			      horiz = -1
			    ELSE
			      numPixels = yDelta
			      dInc1 = xDelta * 2
			      dInc2 = (xDelta - yDelta) * 2
			      yInc1 = 80
			      det = dInc1 - yDelta
			    END IF
			    IF (yA > yB) THEN
			      yInc1 = -yInc1
			      yInc2 = -80
			    ELSE
			      yInc2 = 80
			    END IF
			    ' initial mask, offset and Y address
			    yA = yA * 80
			    ofs = xA \ 4
			    msk = 2 ^ (xA AND 3)
			    ' mostly horizontal (always adjust plane)
			    IF horiz THEN
			      FOR i% = 0 TO numPixels
			        OUT scDATA, msk
			        POKE ofs + yA, c
			        IF (det < 0) THEN
			          det = det + dInc1
			          yA = yA + yInc1
			          xA = xA + 1
			        ELSE
			          det = det + dInc2
			          xA = xA + 1
			          yA = yA + yInc2
			        END IF
			        IF (msk AND 8) THEN
			          msk = 1
			          ofs = ofs + 1
			        ELSE
			          msk = msk + msk
			        END IF
			      NEXT i%
			    ' mostly vertical (adjust plane only on "split" pixels)
			    ELSE
			      OUT scDATA, msk
			      FOR i% = 0 TO numPixels
			        POKE ofs + yA, c
			        IF (det < 0) THEN
			          det = det + dInc1
			          yA = yA + yInc1
			        ELSE
			          det = det + dInc2
			          xA = xA + 1
			          yA = yA + yInc2
			          IF (msk AND 8) THEN
			            msk = 1
			            ofs = ofs + 1
			          ELSE
			            msk = msk + msk
			          END IF
			          OUT scDATA, msk
			        END IF
			      NEXT i%
			    END IF
			  '' DRAW RECTANGLE / DRAW SOLID RECTANGLE
			  ELSE
			    ' Always make X1 left and X2 right
			    SELECT CASE x1
			      CASE IS > x2
			        xA = x2: xB = x1
			      CASE ELSE
			        xA = x1: xB = x2
			    END SELECT
			    ' Always make Y1 top and Y2 bottom
			    SELECT CASE y1
			      CASE IS > y2
			        yA = y2: yB = y1
			      CASE ELSE
			        yA = y1: yB = y2
			    END SELECT
			    xDelta = ABS(xB - xA) + 1
			    yDelta = ABS(yB - yA)
			    ofsY = yA * 80
			    '' DRAW RECTANGLE
			    IF (mode = isRect) THEN
			      ' left and right lines
			      ofs = ofsY + xA \ 4
			      OUT scDATA, 2 ^ (xA AND 3)
			      FOR j% = 0 TO 1
			        FOR i% = 0 TO yDelta
			          POKE ofs, c
			          ofs = ofs + 80
			        NEXT i%
			        ofs = ofsY + xB \ 4
			        OUT scDATA, 2 ^ (xB AND 3)
			      NEXT j%
			      ' top and bottom lines
			      ofs = ofsY + xA \ 4
			      ofs2 = yB * 80 + xA \ 4
			      ' Align starting point on plane (dirty)
			      IF (xA AND 3) THEN
			        msk = &HF XOR (2 ^ (xA AND 3) - 1)' draw all except offset
			        IF (((4 - (xA AND 3)) - xDelta) = 1) THEN msk = msk XOR 8
			        OUT scDATA, msk
			        POKE ofs, c                  ' top
			        POKE ofs2, c                 ' bottom
			        xDelta = xDelta - (4 - (xA AND 3))
			        ofs = ofs + 1
			        ofs2 = ofs2 + 1
			      END IF
			      ' Draw remaining pixels (starting point is now aligned)
			      IF (xDelta > 3) THEN             ' If there's at least 4 pixels,
			        OUT scDATA, &HF                ' Select all four planes
			        DO
			          POKE ofs, c            ' Draw four pixels (top)
			          POKE ofs2, c           ' Draw four pixels (bottom)
			          ofs = ofs + 1          ' Move to next stride
			          ofs2 = ofs2 + 1        ' Move to next stride
			          xDelta = xDelta - 4    ' Remove 4 pixels from count
			        LOOP WHILE (xDelta > 3)  ' Loop until less than 4 remain
			      END IF
			      IF (xDelta) THEN                ' There's less than 4 pixels now
			        OUT scDATA, (2 ^ xDelta) - 1  ' Select all that remains
			        POKE ofs, c                   ' Draw (top)
			        POKE ofs2, c                  ' Draw (bottom)
			      END IF
			    '' DRAW SOLID RECTANGLE
			    ELSE
			      ofs = ofsY + xA \ 4
			      ' Left edge
			      IF (xA AND 3) THEN
			        msk = &HF XOR (2 ^ (xA AND 3) - 1)' draw all except offset
			        IF (((4 - (xA AND 3)) - xDelta) = 1) THEN msk = msk XOR 8
			        OUT scDATA, msk
			        FOR i% = 0 TO yDelta
			          POKE ofs + i% * 80, c
			        NEXT i%
			        xDelta = xDelta - (4 - (xA AND 3))
			        ofs = ofs + 1
			      END IF
			      ' Center
			      IF (xDelta > 3) THEN           ' If there's at least 4 pixels,
			        OUT scDATA, &HF              ' Select all four planes
			        FOR i% = 0 TO yDelta
			          ofs2 = ofs + i% * 80
			          FOR j% = 0 TO xDelta \ 4 - 1
			            POKE ofs2 + j%, c
			          NEXT j%
			        NEXT i%
			        xDelta = xDelta AND &H3       ' Keep remaining pixels
			        ofs = ofs + xDelta \ 4 + 1
			      END IF
			      ' Right edge
			      IF (xDelta) THEN               ' There are fewer than 4 pixels
			        OUT scDATA, 2 ^ xDelta - 1   ' Select all that remains
			        POKE ofs, c                   ' Draw (top)
			      END IF
			    END IF
			  END IF
			  DEF SEG
			END SUB
		
		Some links you might find useful