Imagine you're stuck in an elevator with an evil giraffe that threatens to blow up Neptune in thirty seven minutes unless you write "174" in Roman numerals, and all you have at your disposal is an old computer with QuickBASIC installed. I know it sounds super far-fetched, but it CAN happen... listen, I'm doing my best to get you invested in the topic at hand, cut me some slack. Anyway, as we have all forgotten (I had to look it up,) here's what Roman numerals look like:
I 1
V 5
X 10
L 50
C 100
D 500
M 1,000
Roman representation is "most significant value first" in the sense that the first numeral (leftmost) has more weight to the value than the last numeral (rightmost.) It's identical to the arabic system we use: for instance, in 109 (a hundred and nine,) the leftmost "1" weights more than the rightmost "9."
The system uses two types of logic: additive and subtractive. When values (read right to left) are increasing (or repeated) numerals, they are added. For instance: XVII is 17, or 1 + 1 + 5 + 10. Note that repeated numerals may only be powers of ten (I, X, C, and M,) and only up to three times. "Ackchually, Wikipedia cites a moderator's friend's blog that claims someone once wrote XXXX and refused to admit it was a mistake, therefore..." frankly my dear I don't give a damn what Wikipedia's contrarian opinion about anything is. I don't care.
When values (again, right to left) are decreasing numerals, they are subtracted. For instance: IX is 9, or 10 - 1. Another subtlety here is that we may only subtract one numeral, it may only be a power of ten (so again, I, X, C, or M,) and it has to be within two ranks of the value it subtracts: X can be subtracted from L and C, but not from D or M. Similarly, I may be subtracted from V and X, but not L and up.
It sounds absolutely weird, but it's a fairly decent way to transcribe finger-counting.
In practice, the smallest value we may represent in Roman numerals is 1 (I) and the largest is 3,999 (MMMCMXCIX.) There's technically a way to write values equal to or greater than 4,000 using parenthesis, however that's a very niche usage and I can't see anyone using Roman numerals for anything but years and chapter enumerations nowadays... Right now, what we really want is a way to easily convert digits from Arabic numerals to Roman numerals. Let's begin with values 1 to 9.
I 1
II 2
III 3
IV 4
V 5
VI 6
VII 7
VIII 8
IX 9
How does it work exactly? Well, 1, 2, and 3 are obtained by repeating "I" up to 3 times. 5 is easy as we just write "V." 6, 7, and 8 are obtained by starting at 5, and then repeating "I" up to 3 times. We're left with two numbers: 4 and 9. 4 is written "IV," meaning that we take 5 and subtract 1. 9 is written "IX," meaning that we take 10 and subtract 1. If we had to port that to QuickBASIC, we would write something along the lines of:
DIM outStr AS STRING
DIM digit AS INTEGER
SELECT CASE (digit)
CASE 1 TO 3
outStr = outStr + STRING$("I", digit) ' Repeat I from 1 to 3 times
CASE 4
outStr = outStr + "IV" ' Write as-is for now
CASE 5 TO 8
outStr = outStr + "V" + STRING$("I", digit - 5) ' Repeat I from 0 to 3 times
CASE 9
outStr = outStr + "IX"
END SELECT
The code above will work for any value between 1 and 9. But what if we want to go further? First, we're going to write the entirety of the Roman numerals into a string by increasing value:
CONST romNum = "IVXLCDM"
We notice that every other character is a power of ten: I is 1, X is 10, C is 100, M is 1,000. Similarly, V is 5, L is 50, etc. So every time we increment our reading offset in the Roman numeral list by two, we basically multiply our Arabic numeral by 10: 4 is written "IV" (4 * 10^0 = 4) if we start reading romNum at 1, it becomes "XL" (4 * 10^1 = 40) if we start reading romNum at 3, and becomes "CD" (4 * 10^2 = 400) if we start reading romNum at 5. So, if we can isolate each decimal in Arabic representation and determine their rank, then we could shift our reading offset accordingly!
DIM srcNum AS INTEGER
DIM rank AS INTEGER
srcNum = 109
' Keep going until we have no more digits to process.
WHILE (srcNum)
PRINT "Rank:"; rank, (srcNum MOD 10) ' Print rightmost decimal and its rank.
srcNum = srcNum \ 10 ' Shift digits once to the right.
rank = rank + 1 ' Increment rank.
WEND
' The output is:
' Rank: 0 9
' Rank: 1 0
' Rank: 2 1
Neat! When we combine the reading offset trick and the decimal isolation, here's what we get:
''
'' CONVERT ARABIC NUMERALS TO ROMAN NUMERALS
''
FUNCTION ArabicToRoman$ (value AS INTEGER)
CONST romNum = "IVXLCDM"
DIM srcNum AS INTEGER, digit AS INTEGER
DIM rank AS INTEGER, romOfs AS INTEGER
DIM outStr AS STRING
' This implementation only works for values between 1 and 3,999.
IF ((value < 1) OR (value > 3999)) THEN
EXIT FUNCTION
END IF
' Backup input.
srcNum = value
' Keep going until we have no more digit to process.
WHILE (srcNum)
digit = srcNum MOD 10 ' Isolate rightmost decimal.
romOfs = 1 + (rank * 2) ' Reading offset in romNum.
SELECT CASE (digit)
CASE 1 TO 3
' Repeat base up to three times.
outStr = STRING$(digit, MID$(romNum, romOfs, 1)) + outStr
CASE 4
' Next multiple of five, decrement with base.
outStr = MID$(romNum, romOfs, 1) + MID$(romNum, romOfs + 1, 1) + outStr
CASE 5 TO 8
' Next multiple of five, repeat base between 0 and 3 times.
outStr = MID$(romNum, romOfs + 1, 1) + STRING$(digit - 5, MID$(romNum, romOfs, 1)) + outStr
CASE 9
' Next power of ten, decrement with base.
outStr = MID$(romNum, romOfs, 1) + MID$(romNum, romOfs + 2, 1) + outStr
END SELECT
srcNum = srcNum \ 10 ' Shift digits to the right, once.
rank = rank + 1 ' Increment rank.
WEND
' Return string.
ArabicToRoman$ = outStr
END FUNCTION
Thus, "174" is "CLXXIV" (5 - 1 + 10 + 10 + 50 + 100!) Take that, you evil giraffe!
Believe it or not, converting Roman to Arabic numerals is about as easy! In practice, all we have to do is read the string right to left, convert the Roman notation to a value and compare with the previous value we processed. If the current value is smaller than the previous value, we decrement the count. If the current value is greater or equal to the previous value, we increment the count.
''
'' CONVERT ROMAN NUMERALS TO ARABIC NUMERALS (version 1)
''
FUNCTION RomanToArabic% (source AS STRING)
CONST romNum = "IVXLCDM"
DIM value AS INTEGER
DIM thisInt AS INTEGER, prevInt AS INTEGER
DIM thisOfs AS INTEGER
FOR i% = LEN(source) TO 1 STEP -1
' Find the Roman numeral in the list
thisOfs = INSTR(romNum, MID$(source, i%, 1)) - 1
IF (thisOfs = -1) THEN
value = -1
EXIT FOR
END IF
' Even indices are powers of ten, odd indices are also multiplied by 5.
IF (thisOfs AND &H1) THEN
thisInt = 5 * 10 ^ (thisOfs \ 2)
ELSE
thisInt = 10 ^ (thisOfs \ 2)
END IF
' Increment or decrement, according to previous value.
IF (thisInt < prevInt) THEN
value = value - thisInt
ELSE
value = value + thisInt
END IF
thisInt = thisInt
NEXT i%
' Return value
RomanToArabic% = value
END FUNCTION
And that's it... well, it is as long as we're not interested in string validation. The function above will decode incorrect inputs such as IIII (4,) VV (10,) or IXIX (18.) Some rules should be put in place to prevent that behavior. The following function should catch invalid strings (in which case it returns -1.)
''
'' CONVERT ROMAN NUMERALS TO ARABIC NUMERALS (version 2)
''
FUNCTION RomanToArabic% (source AS STRING)
CONST romNum = "IVXLCDM"
DIM value AS INTEGER
DIM thisInt AS INTEGER, prevInt AS INTEGER
DIM thisOfs AS INTEGER, prevOfs AS INTEGER
DIM streak AS INTEGER, didSub AS INTEGER
prevOfs = -1
' Assume the string is invalid.
RomanToArabic% = -1
FOR i% = LEN(source) TO 1 STEP -1
' Find the Roman numeral in the list
thisOfs = INSTR(romNum, MID$(source, i%, 1)) - 1
' INVALID: not a Roman numeral.
IF (thisOfs = -1) THEN EXIT FUNCTION
IF (thisOfs = prevOfs) THEN
streak = streak + 1
' INVALID: more than 3 times the same Roman numeral in a row.
IF (streak > 3) THEN EXIT FUNCTION
' INVALID: V, L, D twice in a row.
IF ((streak > 1) AND (thisOfs AND &H1)) THEN EXIT FUNCTION
ELSE
streak = 1
END IF
' Even indices are powers of ten, odd indices are also multiplied by 5.
IF (thisOfs AND &H1) THEN
thisInt = 5 * 10 ^ (thisOfs \ 2)
ELSE
thisInt = 10 ^ (thisOfs \ 2)
END IF
' Increment or decrement, according to previous value.
IF (thisInt < prevInt) THEN
' INVALID: we already subtracted.
IF (didSub) THEN EXIT FUNCTION
' INVALID: trying to subtract Roman numeral V, L, or D.
IF (thisOfs AND &H1) THEN EXIT FUNCTION
' INVALID: subtractor must be within two ranks.
IF ((prevOfs - thisOfs) > 2) THEN EXIT FUNCTION
value = value - thisInt
didSub = -1
ELSE
value = value + thisInt
prevInt = thisInt
didSub = 0
END IF
prevOfs = thisOfs
NEXT i%
' Return value
RomanToArabic% = value
END FUNCTION
That's it. It's sloppy, but it works.