How to encrypt files using many prime numbers Part 1 of 4
A proposed new algorithm 25-Oct-04
Category: Algorithm
Language: CBuilder All Versions
Publisher: Nemitz, Vernon
// CRYPTION.C, A Data Protection Tool,
// Copyright (C) 2004 by Vernon Nemitz
// Originally published at www.devsuperpage.com;
// seek the original there if needed.
//
//     Publishing rights are shared with the General Public
//     only as follows:
// (1) This copyright description, including the notices given
//     by any other author, must be entirely retained as the
//     first thing in every copy of this document; its intent
//     and meaning may not be altered by anyone.
// (2) All of the content of this document, as originally
//     acquired by any person, must be retained in all copies
//     made by that person.  However, brief passages may be
//     quoted in reviews and other descriptions.  And anyone
//     is free to clone this copyright description by itself
//     to a new document, and become the original author of
//     the rest of that document.
// (3) Radical editing of this document is allowed, but only
//     of the parts that follow this copyright description.
//     The way to edit this document while retaining prior
//     work is to mark/disable any chosen portion as being
//     inadequate and/or unclear and/or obsolete and/or
//     [describe!] -- and then, likely, to insert and/or
//     append new/replacement material.
// (4) Each who edits this document must clearly identify the
//     work done, and is expected to add a brief note and
//     copyright declaration, similar to the lines for
//     original author(s) above, in the space provided below.
// (5) No one is required to republish this document after
//     editing it, but if that is done, then this copyright
//     notice will continue to apply to it.
//
//
//
// End of copyright information, for this Evolving
// Multi-Owner Document.

// Notice:  This document has been edited from the original
// format, to a narrower column width.  No other significant
// changes have been made (besides adding this Notice and
// fixing the odd typo or comment), and the original author,
// who made those changes, did not think it necessary to
// disable the entire program (wider form) and append a
// near-duplicate (narrower form).

/*
The purpose of this program is to encrypt any specified file easily
and fairly quickly, and also to reverse the process equally easily
-- yet use an encryption scheme so potent that even a quantum
supercomputer would have trouble breaking it.  Of course time will
tell; there are no powerful quantum computers at this writing.

Note that that is the PURPOSE of this program.  It DOES modify files,
and it does successfully unmodify those files, but the algorithms by
which it does its work have not at this writing been subjected to
intense scrutiny by Experts In The Field.  Perhaps there is a fatal
flaw that is unknown to the original author; if so, the security that
the program is intended to provide would just be an illusion.  On the
other hand, perhaps this program really is as good as has been
attempted.  One reason for publishing it is to let the Experts analyze
it, so that the Truth will become known.  Meanwhile, those risk-takers
who gamble on its security can at least be assured that their data
will be protected UNTIL a flaw is discovered -- which may take a
while.

The algorithms invoke a unique property of prime numbers, which the
original author happened to independently discover, and then publish
in the "Journal of Recreational Mathematics", Volume 14, Number 2
(1981-1982), pg 141.  The reciprocal of prime P in at least one
arithmetic Base (and possibly in an infinite quantity of Bases) will
be a repeating series of digits having a length or "period" of P-1.
For example, in the common Base Ten, if P is the prime number 7 and
its reciprocal is computed, the period is P-1 or 6 digits (repeating
142857).  And although when P is the equally prime number 3, its
reciprocal in Base Ten only has a period of 1 digit (repeating 3), in
Base Two (and others) its period is indeed P-1 or 2 digits (repeating
01 in Base Two).  Meanwhile, for every Composite number C, its
reciprocal always has a period that is less than C-1 digits, no matter
in what Base it is calculated.
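
The period claims above can be checked with ordinary long division.
What follows is a minimal sketch, not code taken from CRYPTION.C; the
function name period_of_reciprocal is an assumption made for this
illustration, and it expects the divisor to share no factor with the
Base.

```c
#include <assert.h>

// Length of the repeating block of 1/n written in base b.
// Assumes n > 1, b > 1, and that n shares no factor with b
// (otherwise the loop simply gives up after n steps).
unsigned period_of_reciprocal(unsigned n, unsigned b)
{
    assert(n > 1 && b > 1);
    unsigned long long r = 1;       // remainder before the first digit
    unsigned len = 0;
    do {
        r = (r * b) % n;            // one long-division step
        len++;
    } while (r != 1 && len < n);    // remainder 1 again: cycle closed
    return len;
}
```

Calling it with (7, 10), (3, 10) and (3, 2) reproduces the periods
quoted above, and a composite such as 21 in Base Ten comes out well
short of C-1.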

When examining the digits of a single P-1 period for a prime number,
some order may be detected.  Ignoring 2, primes are odd numbers, so
their longest periods have even quantities of digits.  It is
interesting that if the first half of the longest period is added to
the second half, the result is a series of identical digits, each
being B-1, where B is the Base in which the reciprocal had been taken.
For example, look at the 142857 from 7 and Base Ten above; split and
add: 142 + 857 = 999.  Other than that, though, the digits in the
first half of the period seem to be reasonably pseudo-random.  This is
good because there are lots of prime numbers; we could simply take the
reciprocal of some prime larger than 100 million, and obtain 50
million pseudo-random digits quite easily and repeatably.
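
As an illustration of the half-plus-half observation, here is a small
sketch that produces the digits by long division and then tests the
pairing.  The names fraction_digits and halves_sum_to_max_digit are
hypothetical, chosen for this example only; they are not claimed to
match anything in the real program.

```c
#include <assert.h>
#include <stddef.h>

// Writes the first `count` digits of start/p in base b into out[].
// start = 1 gives the reciprocal; any other start below p is allowed.
void fraction_digits(unsigned p, unsigned b, unsigned start,
                     unsigned char *out, size_t count)
{
    assert(p > 1 && b > 1 && b <= 256 && start < p);
    unsigned long long r = start;
    for (size_t i = 0; i < count; i++) {
        r *= b;
        out[i] = (unsigned char)(r / p);   // next digit
        r %= p;                            // carry the remainder forward
    }
}

// Returns 1 if digit i plus digit i+half equals b-1 for every i.
int halves_sum_to_max_digit(const unsigned char *d, size_t period,
                            unsigned b)
{
    size_t half = period / 2;
    for (size_t i = 0; i < half; i++)
        if ((unsigned)d[i] + d[i + half] != b - 1)
            return 0;
    return 1;
}
```

For P = 7 in Base Ten the six digits are 1 4 2 8 5 7, and each digit
of the first half pairs with its partner in the second half to make 9.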

One simple way to obtain greater security, however, is to divide that
prime into some other number than One (a reciprocal is defined by
division into One).  Any pseudo-random number smaller than the prime
will do (if bigger than the prime, only the division-remainder would
be relevant to these considerations of period-properties).  The net
effect is that the same repeating sequence of digits is generated as
if the reciprocal of the prime had been computed, but the sequence
starts in a different place.  So, if the sequence has a length of
millions, then making its starting point part of the secret encryption
key does indeed increase security somewhat.
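
The "same sequence, different starting place" effect can be seen by
asking how far into the digits of 1/P the digits of N/P begin.  This
is a sketch under the assumption that the long-division remainders
are tracked directly; start_offset is a made-up name.

```c
#include <assert.h>

// Returns how many digits into the expansion of 1/p (base b) the
// expansion of start/p begins, or -1 if start never turns up as a
// long-division remainder of 1/p.
long start_offset(unsigned p, unsigned b, unsigned start)
{
    assert(p > 1 && b > 1 && start > 0 && start < p);
    unsigned long long r = 1;
    for (long k = 0; k < (long)p; k++) {
        if (r == start)
            return k;               // same cycle, entered k digits in
        r = (r * b) % p;
    }
    return -1;
}
```

With P = 7 in Base Ten, 3/7 = 0.428571... starts one digit into
142857, and 2/7 = 0.285714... starts two digits in.  With P = 3 in
Base Ten the period is only 1, so 2/3 is not a shifted copy of 1/3 and
the function reports -1; the shift property relies on the full P-1
period.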

Yet there is more.  If multiple large primes are chosen, each able to
yield millions of pseudo-random digits, then those sets of digits can
be combined (effectively increasing randomness) to make the actual
encrypting-numbers for a message.  Also, if a message to be encrypted
is only a few thousand bytes long, then aren't vast quantities of
potential encrypting-digits being "wasted"?  What if, instead of
using generated digits as soon as they are available, some were
deliberately skipped first?  Depending on the way those digits were
created from more than one prime, this can be a far more devious
thing, than just another variation on the theme of "divide into a
different number than One".  In fact, consider the notion of skipping
a (semirandom) few digits inside the heart of the algorithm, for every
character of any processed message....  Finally, what if you had
several files to encrypt, and you could encrypt all of them at once,
with the same key, using blocks of pseudo-random digits that were
separated by irregular quantities of unused digits?  With this
CRYPTION program, you can do exactly that!
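
How CRYPTION.EXE actually combines its digit streams is not spelled
out in this part of the description, so the following is only a sketch
of the general idea: several long divisions run side by side in Base
256, their digits are mixed (XOR is an assumption made here), and a
skip simply throws digits away.  The type DigitStream and all function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t prime;      // divisor
    uint32_t remainder;  // long-division state, always below prime
} DigitStream;

// Next base-256 digit of remainder/prime.
static uint8_t next_digit(DigitStream *s)
{
    assert(s->prime > 1 && s->remainder < s->prime);
    uint64_t t = (uint64_t)s->remainder * 256u;
    s->remainder = (uint32_t)(t % s->prime);
    return (uint8_t)(t / s->prime);
}

// One keystream byte: XOR of the next digit from every stream.
// XOR is a stand-in; the real mixing rule may well differ.
uint8_t next_key_byte(DigitStream *streams, size_t n)
{
    uint8_t k = 0;
    for (size_t i = 0; i < n; i++)
        k ^= next_digit(&streams[i]);
    return k;
}

// Discard `count` keystream bytes (the skip-number idea).
void skip_bytes(DigitStream *streams, size_t n, unsigned long count)
{
    while (count--)
        (void)next_key_byte(streams, n);
}
```

Because every stream is deterministic, two parties who start from the
same primes, the same starting remainders and the same skip counts
will arrive at the same keystream byte.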

The most perfect encryption system uses totally random numbers to
modify the content of a message.  With totally random numbers, the
message cannot be cracked even in theory, no matter how powerful a
quantum supercomputer is used.  In practice, though, while totally
random numbers appear to be obtainable for encryption, the message
must still be decryptable by the permitted parties -- and for that it
is necessary to use the exact same sequence of totally random numbers.
But guaranteed-identical sequences of totally random numbers have in
the past been essentially impossible to independently obtain at the
separate locations of sender and receiver (it would be a violation of
the definition of "totally random"!).  However, modern quantum
encryption techniques can use the phenomenon known as "entanglement"
to allow completely secure transmission/sharing of totally random
numbers, which are generated just ONCE -- but this trick is just
barely coming out of the laboratory with a high price tag, so it
cannot quite be called ready for prime time.  Thus, until then, for
totally random numbers to be used in encryption, the list of numbers
has to be carried about by the message-recipient, which represents a
security risk.

Meanwhile, the repeatable generation of pseudo-random numbers means
that they can be independently and identically computed at separate
locations.  Their weakness with respect to encryption is, of course,
the fact that they are not totally random.  As noted above, however,
there are ways of mixing pseudorandom numbers together, to make the
result more closely approach the totally random condition.  Note that
such procedures can be computationally intensive, and thus detract
from ease and speed of use.  The particular mixing scheme employed
here takes advantage of the facts that every large prime number offers
an easy way to compute a lot of pseudo-random digits, that there are
infinitely many prime numbers, and that programs can often "trade"
memory for speed (in this case, it uses a lot of memory to hold a lot
of calculation scratchpads, and its main algorithm runs comfortably
fast enough).

Digression:  There is a convenience problem that is directly
associated with encrypting a message using a lot of prime numbers.
Overall, the list of those primes -- and the numbers which they will
be dividing -- constitute the "encryption key", the information needed
to decrypt a message later.  Who is going to remember such a key?  And
writing it down leads to the same security risk as already mentioned.
The chosen solution is to obtain the list of primes from an
encipherable source....

Enciphering is not the same as encryption.  Ciphering represents a
larger concept, of which encryption is just a subset.  Thus ciphers
can use random ideas to hide other ideas, while encryption is
thoroughly mathematical these days.  So, an example pre-arranged
cipher might be: "If I should happen to use the word 'Friday' in the
next conversation, you will stop listening at the door and call the
fire department."  The context of a cipher can be as important as the
cipher itself!

So, let us consider a List Of Only/All Prime Numbers Smaller Than One
Billion (for example).  There are many millions of primes on that
List.  If we choose to use one thousand of them in combination to
encrypt a message, we can decide to specify those primes either by
exact value, or by position in the List.  The difference is quite
significant.  Primes are particular numbers, not always easy to
identify (especially when large), while mere positions on a list are
ordinary numbers.  And ordinary numbers can be obtained anywhere!
(So, how many ways are there to select a thousand primes, to be used
as an encryption key, from many-millions?  It is a large enough number
to give pause to any would-be code-breaker!  And we can easily make
message-cracking tougher by choosing a longer Main List of primes
-- CRYPTION.EXE uses more than 200 Million -- and also by selecting
more than a mere thousand of them -- CRYPTION.EXE can select 65
thousand.)
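
To put a rough figure on that question, take the 203,280,221 primes
of the full List described further on, and assume (for the sake of the
arithmetic only) that a key is an ordered choice of one thousand of
them with repeats allowed.  That is an upper bound on the number of
distinct keys, and it counts possibilities; it says nothing by itself
about how hard the scheme is to attack.

```latex
\begin{align*}
N &= 203{,}280{,}221, \qquad k = 1000 \\
\log_{10} N &\approx 8.3081
  \;\Longrightarrow\; N^{k} \approx 10^{8308} \\
\log_{2} N &\approx 27.6
  \;\Longrightarrow\; N^{k} \approx 2^{27{,}600}
\end{align*}
```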

Now please consider an already existing file of some sort, perhaps an
".MP3" music file.  Examined from the currently relevant perspective,
this file is just a collection of numbers that represent music-data.
Better, it is compressed data, which means that most of the
repetitiveness of the data in that file has been squeezed out --
leaving mostly random numbers behind!  Suppose we use this modestly
random collection of ordinary numbers as a way to select primes from
the List of Only/All Primes -- with the length of that file indicating
how many to select?  That means the file is the key to encrypting/
decrypting the message, and easy to remember!  More, you can now
encrypt the message and pass it on, and at some unrelated (even
earlier!) point in time, mention to the recipients a particular .MP3
file that they should hear.  This enciphering of the key makes
cracking the message at their end easy, but hard for anyone else.
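
One way to picture "ordinary numbers select primes by position" is the
sketch below.  It is an assumption-laden illustration: each group of 8
key-file bytes is folded into one position on the List, which matches
the one-prime-per-8-bytes ratio mentioned later in this description,
but the real program also pseudorandomizes the key data first, and
that step is left out here.  positions_from_key is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PRIME_COUNT 203280221u   // primes below 2^32, per the text

// Folds each 8-byte group of the key into a 0-based position in the
// List.  Returns how many positions were written (key_len / 8,
// capped at max_out).  No pseudorandomizing is done in this sketch.
size_t positions_from_key(const uint8_t *key, size_t key_len,
                          uint32_t *out, size_t max_out)
{
    assert(key != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i + 8 <= key_len && n < max_out; i += 8) {
        uint64_t v = 0;
        for (size_t j = 0; j < 8; j++)
            v = (v << 8) | key[i + j];      // big-endian fold of 8 bytes
        out[n++] = (uint32_t)(v % PRIME_COUNT);
    }
    return n;
}
```

Each returned position would then be looked up in the compressed List
to fetch the actual prime.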

Keep in mind that there are lots of other kinds of compressed data
files that can be used as encryption keys in this scheme.  Nor do the
key files have to be compressed, because the overall algorithm only
indirectly uses the data in the key files (it is pulled,
pseudorandomized, and THEN used to select primes!).  Nevertheless,
you and your pals could consider owning identical copies of some
particular version of a data-compression program, such that you know
that any file that might get compressed by the program will be
compressed identically.  Then you could plan some sneaky thing like
saying, "Go to this Web address ...", which means to do exactly that.
But it also means, "Save that Web page as a local file, compress the
file, and then use it as the decryption key."  (Do note that date/time
stamps may interfere with that notion's simplicity -- but then, this
will depend on the "particular version" of the compression program.)
If you are concerned about the Web page being modified before it can
be used as a decryption key, then try www.archive.org -- their job is
to archive billions of Web pages, with the goal of preserving them
just as they were.  Anyway, be creative with your enciphering of the
encryption key!  Even encrypted files can be used as the keys to other
encrypted files!  And, suppose you occasionally put the decryption
information for your next message -- including the Key File! --
somewhere inside a zipped-up current message, just to provide someone
monitoring your communications with not even a hint regarding an
enciphered key....  Digression ends.

This CRYPTION program will request a file-name to use as the key, and
also a list of files to be encrypted or decrypted, and other
information such as skip-numbers.  Only a minimal check will be made
to find some random data in the key file, so you have many choices --
and no record at all will be made of the relationship between the key
file and the message file(s), because such a record is still a
security risk.  If the user wishes to create a personal list of "keys
and files", and encrypt THAT, so that its encryption key is the only
thing to remember, fine.  All this program will do is encrypt/decrypt.
Managing any other details is not its job.

The reason for taking that position here is that such a list can
potentially be complicated enough to require a special computer
program of its own, just to manage it.  After all, just because a file
has been encrypted, that does not mean someone might never want to
encrypt it again, for additional security, with either the same key
or another key.  More, it is important to realize that simply because
the decryption process is the exact inverse of the encryption process,
then this program's decryption process could be used INSTEAD of this
program's encryption process, to scramble a message, after which the
encryption process would be needed to unscramble it!  When you
consider that a message might be scrambled multiple times in alternate
ways using multiple keys, then you might agree that keeping track of
all the possibilities (and the correct overall unscrambling sequence)
is beyond the scope of the present program, a simple "encryption/
decryption engine".
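
The point about exact inverses can be demonstrated with any pair of
operations that undo each other.  The byte-wise add and subtract below
are stand-ins chosen for this sketch; they are not claimed to be the
operations CRYPTION.EXE performs, and both function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in "Encrypt": add one keystream byte to each data byte
// (mod 256).
void add_keystream(uint8_t *data, const uint8_t *ks, size_t n)
{
    assert(data != NULL && ks != NULL);
    for (size_t i = 0; i < n; i++)
        data[i] = (uint8_t)(data[i] + ks[i]);
}

// Stand-in "Decrypt": the exact inverse, subtracting the same bytes.
void sub_keystream(uint8_t *data, const uint8_t *ks, size_t n)
{
    assert(data != NULL && ks != NULL);
    for (size_t i = 0; i < n; i++)
        data[i] = (uint8_t)(data[i] - ks[i]);
}
```

Running sub_keystream first scrambles the data just as thoroughly, and
add_keystream with the same keystream then restores it, which is the
"Decrypt to scramble, Encrypt to unscramble" situation described
above.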

One minor dilemma results from the fact that any scrambled file will
of course no longer be worthy of its original file extension (like
.TXT).  All scrambled files will consist of simple binary data, and
any program that expects something else (the format of the original
file) will likely claim that the file has become corrupted.  Therefore
it is logical that the newly created scrambled file be given a new
file extension, a unique group of letters that no other program will
claim an ability to properly digest for presentation to the user.  The
chosen replacement extension is the first three consonants of this
program's name: .CRP, which coincidentally lets the file now brag
about its own "corruption" (or even "crappiness") as far as its use by
any other program is concerned....

Next, should some special program be written to manage files and keys,
that program might call this one with a long string of "parameters",
and thereby pass the file-name and other information that the CRYPTION
program needs, and bypass the requests it would otherwise make.  The
user of such an overall controller program would find that arrangement
convenient.  Therefore, formally, when the CRYPTION program is
invoked by another, via a command to the computer's Operating System,
the list of parameters must be as follows:  (1) A required name of the
file, inside quotation marks, to be used as a key (Drive and Path
optional); (2) An optional letter D, to delete the key file after all
work is done -- the default is to retain the file; (3) A required skip
number, though 0 (zero) is okay; (4) A required algorithm code-letter,
either E for Encrypt or D for Decrypt; (5) The required name of a file
to be affected by that algorithm, again enclosed by quotes; (6) An
optional letter D to delete that file after all work is done -- this
letter is only relevant for files that do not have a .CRP extension,
because any .CRP file is assumed to be scrambled and does not need to
persist beyond its time of intermediacy (so a further-scrambled .CRP
file always REPLACES the prior .CRP file, and a finally-unscrambled
.CRP file will have its file-extension changed).  The last four
parameters may be repeated as a group several times, allowing one key
to be used to affect many files.  Here is an example (one repetition):

CRYPTION.EXE
  "C:\KEY FILE.BIN"2222E"C:\TMP\MESSAGE1.TXT"7654D"MSG2.TXT"D

Above, the values 2222 and 7654 are skip numbers; the file MESSAGE1.TXT
will be processed via the Encryption algorithm, and the MSG2.TXT file
via Decryption; and lastly, only MSG2.TXT will be deleted, and then
only if all work has been successfully completed.  (NOTE: skip numbers
must not be negative, and if they are too large -- say, more than a
million -- they will significantly slow down the CRYPTION program.
Thus the maximum allowed skip number is 9999999, just a trifle under
ten million.)  Files named in that example may be any other thing
appropriate, of course, and if the drive and directory path are not
included, the CRYPTION program will let the Operating System seek the
file in the usual default places.  Ordinary quotation-mark  "  symbols
are required to guarantee correct parsing of file names from the
parameters, especially when they contain spaces, such as KEY FILE.BIN
in the example.  (Note that  "  symbols are not allowed, by the
Operating System, to be part of a file name, which means quotes are
perfect markers for the start and end of a path\file name.)  Next, the
example shows minimum spaces used as item separators, but additional
spaces are quite acceptable (except that quote marks must exactly
delimit file names).  IMPORTANT:  The length limit for any one
DRIVE:\PATH\FILENAME.EXT is 48 characters.  Equally important:  If
parameters are supplied to the CRYPTION program when the Operating
System is told to run it, those parameters will be tested for
validity.  Anything missing or incorrect will cause a specific request
for needed information, just as if the CRYPTION program had been
invoked with no parameters.  Thus an automated controller program that
invokes CRYPTION must be certain it passes valid data in the
parameters, because otherwise this program will sit and wait until it
gets human input.
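
Since a controller program must hand over valid parameters, it may
help to see how one group of them could be assembled and checked
before CRYPTION is ever started.  This is a sketch of a hypothetical
helper for such a controller (append_group is not part of CRYPTION);
it applies only the limits stated above: skip numbers from 0 to
9999999, a code-letter of E or D, and at most 48 characters per file
name with no quote marks inside it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATH_LEN 48   // per-file limit stated in the text

// Appends one   <skip> <E|D> "<file>" [D]   group to cmd.
// Returns 0 on success, -1 if an argument is invalid or cmd is full.
int append_group(char *cmd, size_t cap, long skip, char mode,
                 const char *file, int delete_after)
{
    assert(cmd != NULL && file != NULL);
    if (skip < 0 || skip > 9999999L) return -1;
    if (mode != 'E' && mode != 'D') return -1;
    if (strlen(file) > MAX_PATH_LEN || strchr(file, '"')) return -1;
    size_t used = strlen(cmd);
    if (used >= cap) return -1;
    int n = snprintf(cmd + used, cap - used, " %ld %c \"%s\"%s",
                     skip, mode, file, delete_after ? " D" : "");
    return (n < 0 || (size_t)n >= cap - used) ? -1 : 0;
}
```

Starting from a buffer that already holds the program name and the
quoted key file, two calls rebuild the earlier example with single
spaces as separators, which the text says are acceptable.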

Next, another and much more serious dilemma also flows from the
possibility that some file might be scrambled multiple times, and that
the Decryption option might be used to Encrypt it.  If this program is
going to replace the original file extension with another, how is it
going to recognize that an original file has been restored, so that
the original extension can be restored?  Only a very iron-clad policy
with respect to file-manipulation is going to solve the dilemma, and
this is it:
 1. Regardless of encryption/decryption mode, this program will always
    pay attention to the file's current extension, and make a quick
    examination of the start of the file.
 2. If the original file is about to receive its initial scrambling,
    then its extension will not be .CRP, and this program will expect
    to NOT find a certain key phrase at the start of the file.  In
    other words, finding that phrase will imply that the file
    currently exists in a scrambled state; not finding that phrase
    means that it will be inserted at the beginning of the file.
 3. When this program inserts the key phrase to the beginning of the
    file, it will place it there TWICE, along with an additional
    phrase that holds the original file extension.  For example, if
    the extension was TXT, then this is what gets added to the start
    of the to-be-scrambled file:
      DATA PROTECTED FROM SNOOPERCOMPUTERS WITH CRYPTIONITE:
      Data Protected From Snoopercomputers With Cryptionite:
      The Original File Extension is TXT.
 4. The preceding will be one long string of text, with two spaces
    after each part of it, including the third part (it's not the
    multi-line thing just presented).  The long message has two
    purposes.  First, no existing file nor future file has any need to
    begin with that exact long sequence of characters, UNLESS it has
    been altered by CRYPTION.EXE.  Second, the last two parts of that
    text will be scrambled along with the rest of the file.  When that
    text reappears after this program has performed either encryption
    or decryption, then this program can make the reasonable
    assumption that the rest of the original file has also been
    restored.
 5. At that point CRYPTION will take the just-revealed file extension
    and apply it -- and also completely remove the special text.  Note
    that the exact identifier just described is 87 characters from
    "Data Protected " to " Extension is ".  Also note that each
    character is normally stored in one "byte" of memory, and that a
    byte can hold any of 256 possible characters.  Thus, for this
    particular sequence of characters to be generated at random, the
    odds are 1 in 256-raised-to-the-87th-power!  This is the basis
    behind the reasonable assumption that when the last two parts of
    the inserted text appear, the original file is restored.
 6. When multiple scramblings are performed, this will be indicated by
    the first (capitalized) part of the specially-inserted text
    (including two spaces after the colon).  When it is present,
    nothing will be done to the start of the file, but everything
    after that phrase will have either Encryption or the inverse
    (Decryption) applied to it.  Of course, after every such process,
    this program will check to see if the rest of the original
    inserted text has reappeared, as already described.
 7. Now, will embedding such a plainly-stated phrase make it easier
    for those wanting to unscramble a message, against whom the
    message was encrypted in the first place?  Perhaps.  Nevertheless,
    it will remain true that thousands of prime numbers, chosen with
    semi-randomness from more than 200 million, must be exactly
    identified BEFORE even that initial identifier can be correctly
    unscrambled, to say nothing of the rest of the message.  With
    200-million-raised-to-the-thousandth-power as the MINIMUM number
    of ways to encrypt a message, by the time they can build big
    enough quantum supercomputers to crack the message, not only will
    the temporal relevance of the message have passed, but by then the
    truly-unbreakable quantum encryption techniques will be in wide
    use.
WARNING: Any other program that modifies any aspect of the preceding
may cause an inability to restore the original file!  (Of course, if
you really know what you are doing, you might acquire even greater
security for your messages, instead.  But if you goof, don't say you
weren't warned!)
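
The identifier described in the policy above can be written out and
measured.  The sketch below only builds the text and recognizes its
never-scrambled first part; the macro and function names are
hypothetical, and the actual scrambling of the remaining parts is not
shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// First part: never scrambled, 54 characters plus two spaces.
#define CAPS_MARK \
    "DATA PROTECTED FROM SNOOPERCOMPUTERS WITH CRYPTIONITE:  "
// Second part and start of third: scrambled with the file, 87 chars.
#define MIXED_MARK \
    "Data Protected From Snoopercomputers With Cryptionite:  " \
    "The Original File Extension is "

// Writes the full identifier for extension `ext` (no dot) into buf.
// Returns the identifier length, or -1 if buf is too small.
int build_identifier(char *buf, size_t cap, const char *ext)
{
    assert(buf != NULL && ext != NULL);
    int n = snprintf(buf, cap, "%s%s%s.  ", CAPS_MARK, MIXED_MARK, ext);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

// 1 if the data begins with the capitalized phrase, meaning the file
// is currently in a scrambled state.
int looks_scrambled(const char *data, size_t len)
{
    size_t mark = strlen(CAPS_MARK);
    return len >= mark && memcmp(data, CAPS_MARK, mark) == 0;
}
```

The first part measures 56 characters and the stretch from "Data
Protected " to " Extension is " measures 87, in agreement with the
figures given in this description.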

A variant dilemma will be discovered by anyone who attempts to do
this:
  CRYPTION.EXE
    "C:\KEY.BIN" 905 E "C:\MESSAGE.TXT" 823 D "C:\MESSAGE.CRP"
which attempts to first apply the encryption algorithm to MESSAGE.TXT
-- and which should successfully create a file called MESSAGE.CRP (the
newly created file is always located in the same directory as the
original file).  However, using a completely different group of
pseudorandom numbers derived from the key file (thanks to that initial
encryption effort and the skip numbers), the above example specifies
applying the decryption algorithm to the just-created file.  The
dilemma is NOT the fact that MESSAGE.CRP will not exist at the time
the CRYPTION program checks for valid parameters; the program has code
in it designed to handle this situation -- but you do have to specify
an IDENTICAL path for the second file.  No, the dilemma concerns
unscrambling the file! -- which requires separate stages.  Example:
  CRYPTION.EXE
    "C:\KEY.BIN" 905 E "C:\GARBAGE.CRP" 823 E "C:\MESSAGE.CRP"
where GARBAGE.CRP is simply a disposable copy of MESSAGE.CRP -- and
the second stage:
  CRYPTION.EXE "C:\KEY.BIN" 905 D "C:\MESSAGE.CRP"

See, the algorithms quickly generate, use, and FORGET pseudorandom
numbers (which for one thing is why the skip numbers cannot be
negative numbers).  The algorithms are not "commutative" (a property
under which the order of "Encryptions" and "Decryptions" would be
unimportant) -- that order IS important!  So the second scrambling of
MESSAGE.TXT has to be undone first, which naturally requires getting
at the precise place in the sequence of pseudorandom numbers where the
second part of the original scrambling began.  Finding that location
involves the length of the original file MESSAGE.TXT, which was
modified by adding the identifier-text described earlier -- including
adding the file extension, which might not be 3 characters!  But by
design, the initial AND any subsequent scramblings of a file always
mess with the same amount of added-on text.  This is what makes it
possible to copy MESSAGE.CRP to the new file GARBAGE.CRP, and use it
as a "filler", as if its length was just a skip number.  Note that
either Encryption or Decryption can be specified for GARBAGE.CRP,
because the result should be discarded.  Also, recall that
CRYPTION.EXE does not change the filename GARBAGE.CRP, so specifying a
D (before the 823) in the above unscramble-example is useless.  The
CRYPTION program already automatically deletes "original" .CRP files,
by overwriting them with the do-not-delete "resulting" .CRP files.
Thus you would have to manually delete GARBAGE.CRP sometime after that
first unscramble-stage.

There IS an alternative to using a copied/renamed file like
GARBAGE.CRP.  If you know the exact length of the .CRP file, as
measured in bytes (let's pretend here it is 12345), then a grand Skip
Number CAN be computed for the first unscrambling stage.  Since all
.CRP files will begin with a special capitalized identifier phrase (56
bytes long) that is never scrambled, the portion of the file that IS
scrambled is always 56 less than the total length (12345-56=12289 in
this pretending).  So, add that computed 12289 to all other
appropriate Skip Numbers (905 and 823 above), and the first
unscrambling stage is:
  CRYPTION.EXE "C:\KEY.BIN" 14017 E "C:\MESSAGE.CRP"
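
That arithmetic is simple enough for a controller program to perform
on its own.  A minimal sketch follows, with grand_skip as a
hypothetical name; the 56 is the length of the never-scrambled
capitalized phrase given above.

```c
#include <assert.h>

#define CAPS_MARK_LEN 56L   // unscrambled capitalized phrase, per text

// Skip number that jumps over an earlier scrambling pass made on a
// .CRP file of crp_len bytes, plus the individual skip numbers that
// came before and after that pass.
long grand_skip(long crp_len, long skip_before, long skip_after)
{
    assert(crp_len >= CAPS_MARK_LEN);
    assert(skip_before >= 0 && skip_after >= 0);
    return (crp_len - CAPS_MARK_LEN) + skip_before + skip_after;
}
```

With the pretended length of 12345 and the skip numbers 905 and 823,
the result is the 14017 used in the command above.  A controller would
also want to confirm that the total stays within the 9999999 limit on
skip numbers.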

Well, from the preceding, you may imagine that a special program that
keeps track of scrambling-data is going to need to be intelligent
enough to deal with the dilemma just described.  (However, that
computation based on file-length would be a very easy thing for it to
do.)  And due to the number of data items that it will need to
record (although, remember, some parameters ARE optional), users of
CRYPTION.EXE are going to want it sooner, not later.  The original
author DOES have tentative plans to develop a controller program, but
getting the CRYPTION program working correctly and "out there" has
greater priority -- and due to other commitments, the author would be
neither surprised nor complaining if someone else wrote it first.

As stated previously, a List of Prime Numbers is required to be
available for this program to work.  A suitable List does exist,
prepared in a format that compresses the data yet allows easy access.
That List holds all primes smaller than 4,294,967,296, or all primes
that fit in 32 bits of data.  There are 203,280,221 prime numbers on
that List, and they manage to fit in about 100 megabytes of space.
The name of this compressed-Primes file, including the directory where
the CRYPTION.EXE program expects to find it, is
"\PRIMES\COMPRESS.PRM".  The actual drive holding this directory and
file doesn't matter; CRYPTION.EXE will search all the computer's
drives (except A: and B:), looking for it (and will abort if not
found).  In addition to "COMPRESS.PRM", the "\PRIMES" directory must
also contain a file named "CMPRMDEX.QNT", which is a special index
allowing easy access to the Nth prime inside "COMPRESS.PRM".  The
CRYPTION.EXE program will also abort if it cannot find that index
file.
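
A drive search of the kind just described might look like the sketch
below.  It assumes that "all drives except A: and B:" means the
letters C through Z, and that being able to open the file for reading
is a good enough test; find_primes_drive is a hypothetical name and
the real program may go about this differently.

```c
#include <assert.h>
#include <stdio.h>

// Tries C: through Z: and returns the first drive letter on which
// \PRIMES\<name> can be opened for reading, or 0 if none is found.
char find_primes_drive(const char *name)
{
    assert(name != NULL);
    char path[64];
    for (char drive = 'C'; drive <= 'Z'; drive++) {
        int n = snprintf(path, sizeof path, "%c:\\PRIMES\\%s",
                         drive, name);
        if (n < 0 || (size_t)n >= sizeof path)
            return 0;               // name too long for the buffer
        FILE *f = fopen(path, "rb");
        if (f != NULL) {
            fclose(f);
            return drive;
        }
    }
    return 0;
}
```

A caller would run it once for "COMPRESS.PRM" and once for
"CMPRMDEX.QNT", and give up if either call returns 0.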
487 
488 The two files are easily and fairly quickly created by another program
489 that is called "PRMPRESS.EXE" (a 150-Mhz processor can crank out
490 COMPRESS.PRM in about 20 minutes).  PRMPRESS.EXE can in turn be
491 obtained by compiling the source file "PRMPRESS.C".  You should be
492 able to find it on the Web, perhaps at www.devsuperpage.com.  All the
493 details regarding the compressed-data-format and index files are
494 described in "PRMPRESS.C".  Also, for minor entertainment, there is a
495 program called "PIKPRIME.EXE", which utilizes both data files.  It lets
496 you enter a number  N  and then (1) tells you if that number is a
497 prime, and (2) tells you what the Nth prime is -- provided N is smaller
498 than 203,280,222.  PIKPRIME.EXE can be obtained by compiling the source
499 file "PIKPRIME.C", and it also should be findable on the Web.  The
500 reason that PIKPRIME.EXE is mentioned here is that its methods of
501 using the index file to access the Nth prime are part of the
502 CRYPTION.EXE program.
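
To show why an index makes "the Nth prime" cheap to find, here is a
small self-contained sketch.  It does NOT use the real COMPRESS.PRM or
CMPRMDEX.QNT formats (those are documented only in PRMPRESS.C); the
names LIMIT, BLOCK, before[] and nth_prime are hypothetical, and a tiny
in-memory sieve stands in for the 32-bit List.

```c
#include <string.h>

#define LIMIT   1024u             // sieve covers 0..LIMIT-1 only
#define BLOCK   64u               // how many numbers each index entry spans
#define NBLOCKS (LIMIT / BLOCK)

static unsigned char is_prime[LIMIT];     // 1 if that number is prime
static unsigned int  before[NBLOCKS + 1]; // primes below block b's start

// Sieve of Eratosthenes, then build the cumulative-count index.
static void build_tables(void)
{
  unsigned int i, j, b;
  memset(is_prime, 1, sizeof is_prime);
  is_prime[0] = is_prime[1] = 0;
  for (i = 2; i * i < LIMIT; i++)
    if (is_prime[i])
      for (j = i * i; j < LIMIT; j += i) is_prime[j] = 0;
  before[0] = 0;
  for (b = 0; b < NBLOCKS; b++)
  { unsigned int count = 0;
    for (i = b * BLOCK; i < (b + 1) * BLOCK; i++) count += is_prime[i];
    before[b + 1] = before[b] + count;
  }
}

// Returns the Nth prime (N counted from 1), or 0 if N is out of range.
static unsigned int nth_prime(unsigned int n)
{
  unsigned int b = 0, i, seen;
  if (n == 0 || n > before[NBLOCKS]) return 0;
  while (before[b + 1] < n) b++;   // index lookup: pick the right block
  seen = before[b];
  for (i = b * BLOCK; ; i++)       // short scan inside that one block
    if (is_prime[i] && ++seen == n) return i;
}
```

After calling build_tables(), is_prime[] answers question (1) and
nth_prime() answers question (2); only one block ever has to be
scanned, however large N is.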
503 
504 /////////////////////////////////////////////
505 SEQUENCE OF EVENTS AS CRYPTION.EXE RUNS:
506 
507 0. Checks each drive, starting at C:, for a \PRIMES directory
508    containing the COMPRESS.PRM and CMPRMDEX.QNT files.  Aborts if not
509    found.
510 1. Checks parameters.  Missing info is requested.  The user just types
511    the data and sometimes presses the ENTER key.  (Single-character
512    data items do not need ENTERing.  Simple editing of ENTERed data,
513    via Up-Arrow and BackSpace keys, is also included.)  CRYPTION
514    verifies existence of the specified Key File, and all specified
515    "Message" files.
516 2. The Key File must be at least 8K in length, but can be lots longer.
517    CRYPTION loads up to a megabyte of the Key File for a minimally-
518    random test and later use.  Invalid info leads to repeated requests
519    for input.
520 3. Loads the CMPRMDEX.QNT file and constructs a table which will be
521    used to quickly locate the Nth prime in the COMPRESS.PRM file.
522 4. Uses previously-loaded Key-File data to construct an array of
523    primes, their quantity being roughly 1/8 the length of the Key
524    File (so a 16K Key could yield an array of 2048 primes -- except
525    that duplicates are mostly prevented.  And while the maximum-loaded
526    key-file-data can be a megabyte, no more than a 64K array of primes
527    will be created).  These primes, plucked pseudorandomly from the
528    COMPRESS.PRM file, will in turn yield pseudorandom numbers used to
529    process any/all "Message" file(s).
530 5. Opens each "Message" file, along with a new .CRP file.  Adds (or
531    doesn't add) the initial identifying phrases, and the original file
532    extension.
533 6. Processes the "Message" file.
534 7. If a "Message" file becomes unscrambled, deletes identifiers at the
535    start of the file, and restores the original file extension.
536 8. Deletes specified key and/or original "Message" file(s).  Always
537    deletes intermediate "Message" .CRP files.
538 9. Done; program automatically quits.
539 
540 COMMENTS IN THE PROGRAM CODE WILL EXPLAIN THE MAIN CRYPTION ALGORITHMS
541 USED
542 *///////////////////////////////////////////
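
// The size limits in steps 2 and 4 can be sketched as a small
// calculation.  This is only an illustration:  the function name and
// the constants' names are hypothetical, and "64K" is ASSUMED here to
// mean 65,536 array entries.  The result is an upper bound, because
// the description says duplicates are mostly prevented.

```c
#define MIN_KEY_BYTES  (8UL * 1024UL)     // Key File must be at least 8K
#define MAX_KEY_BYTES  (1024UL * 1024UL)  // at most a megabyte is loaded
#define MAX_PRIMES     65536UL            // assumed meaning of "64K array"

// Returns the largest number of primes the array could hold for a Key
// File of the given length, or 0 if the Key File is too short to use.
static unsigned long max_prime_count(unsigned long key_file_len)
{
  unsigned long loaded, count;
  if (key_file_len < MIN_KEY_BYTES) return 0;
  loaded = (key_file_len > MAX_KEY_BYTES) ? MAX_KEY_BYTES : key_file_len;
  count = loaded / 8;                     // roughly 1/8 the key length
  return (count > MAX_PRIMES) ? MAX_PRIMES : count;
}
```

// For example, a 16K Key File gives 16384 / 8 = 2048, matching the
// figure quoted in step 4, while a full megabyte of key data would
// give 131072 and so is capped.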
543 
544 /*
545 THIS  C  LANGUAGE SOURCE-CODE FILE CAN BE SUCCESSFULLY COMPILED INTO A
546 WORKING WINDOWS PROGRAM BY BORLAND (Turbo) C++ 4.52, BY BORLAND C++
547 BUILDER 5, BY MICROSOFT VISUAL C++ 6, AND BY OPEN WATCOM 1.2.  Other
548 compilers for Windows programs have not been tested.  Note that the
549 first of the two Borland compilers dates from the era when Windows
550 started to replace DOS widely, and Borland was starting to drop the
551 word "Turbo" from its application-development tools.  The others are
552 thorough-going Windows compilers, suitable even for Windows 2000/XP.
553 All four are fairly easily available -- for example, the Borland 4.52
554 compiler is often given away as part of a self-teaching C programming
555 package.  (And the Open Watcom compiler is a free download, although a
556 donation is requested.)
557 
558 There are no compiler-specific aspects to this source code.  As you
559 see it here, unchanged, any of those four compilers can crunch it into
560 a working program.  (Whether or not that remains true in the future,
561 after other workers have edited this code, remains to be seen.  :)
562 
563 The exact steps to compile the program differ with each compiler, but
564 are similar enough to be described as follows:  Set up a Project named
565 CRYPTION, and use the Project Options Editor/Configuration-Tool to
566 delete ALL default files, such as "CRYPTION.RES" or "CRYPTION.DEF",
567 and ensure that the ONLY Project file is this one, CRYPTION.C -- you
568 may have to specify a C Project and not a C++ Project, and if you can
569 specify an EMPTY project do so.  Ensuring that this CRYPTION.C file is
570 the sole source-code file then becomes easy.  Note that this file is
571 compatible with pre-1999 ANSI  C  standards, EXCEPT for lots of usage
572 of pairs of ordinary slash marks  //  as comment-indicators.  (In 1999
573 the double-slash comment indicator became part of the ANSI standard
574 for the  C  programming language.)
575 
576 BEWARE OF THE COMPILER/DEVELOPMENT-ENVIRONMENT REPLACING SPACES WITH
577 TABS!!!  THERE ARE NO TABS IN THIS FILE; NEARLY ALL INDENTATIONS HERE
578 ARE PAIRS OF SPACES.  The problem of the moment is that different
579 people prefer different tab-sizes, and much careful formatting at one
580 tab-size looks really ugly under almost any other.  These compilers
581 mostly offer built-in source-code-editors, along with options that
582 allow you to specify that tabs should not be used at all, and this is
583 recommended.  (Exception:  The Open Watcom compiler allows you to
584 specify your favorite text editor as part of its Integrated
585 Development Environment -- so that editor's own configuration must be
586 examined to eliminate tabs.)
587 
588 After compiling, the executable file CRYPTION.EXE should successfully
589 work as described elsewhere herein.  However, if CRYPTION.EXE is
590 copied to another computer, where it was not compiled, it may be
591 necessary to copy certain other files that come with the compiler, and
592 to keep them together.  The short list:
593   If compiled under Borland (Turbo) C++ 4.52, the file CW3215.DLL is
594     required.
595   If compiled under Borland C++ Builder 5, the number of required
596     support-files can depend on a variety of things, such as whether
597     or not the program was compiled for debugging.  You will discover
598     which ones you need as you attempt to run a copy of CRYPTION.EXE
599     on a computer that contains no compiler.  Note that Borland
600     considers those files to be "redistributables" -- they are
601     INTENDED to be copied along with .EXE files.
602   If compiled under Microsoft Visual C++ 6, no other files are
603     required.  Since this CRYPTION program was written for Windows,
604     and since Microsoft is the monopolistic provider of the Windows
605     Operating System, it figures that the Microsoft compiler has a
606     "home team advantage" over competitor compilers.
607   If compiled under Open Watcom 1.2, no other files are required.  The
608     requested donation appears to have been earned!
609 
610 Note that the CRYPTION program uses no obscure Windows functions, and
611 so is compatible with "WINE" under Linux.  Since this file is being
612 openly shared with the general public, with few restrictions, it is
613 expected that someone will quickly replace the Windows-specific code
614 with Linux-specific code, while hardly touching the program's
615 operational algorithms.  Various modifications will probably be
616 necessary if this program is used with another Operating System (for
617 example, a lot of backslash  \  symbols in Windows file names must
618 be edited into regular slash  /  symbols under Linux or Unix).
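
As a small illustration of the path-separator issue, a port might
convert names at run time instead of editing every string by hand.
The helper name to_unix_slashes is hypothetical; nothing like it is
claimed to exist in this program.

```c
#include <stddef.h>

// Hypothetical helper: rewrites a Windows-style path in place, turning
// every backslash into a regular slash.
static void to_unix_slashes(char *path)
{
  size_t i;
  for (i = 0; path[i] != '\0'; i++)
    if (path[i] == '\\') path[i] = '/';
}
```

Drive letters such as "C:" have no direct Unix equivalent, so a real
port would need to handle them separately.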
619 
620 Much of the commentary in this program is intended to help budding
621   C  programmers.  Those who know this stuff can ignore it.
622 
623 NOTE TO ADVANCED WINDOWS/C PROGRAMMERS:  Yes, this is known to be a
624 combination of ugly Windows code and reasonably elegant CRYPTION code.
625 Most of the original effort was put into the most important portions,
626 and it was not desired to waste a lot of time on part of a program
627 that (A) is expected to be superseded by a CALLER program (one that
628 keeps track of all encryption keys) and/or (B) is expected to be
629 subjected to  #ifdef  or commenting by workers who prefer Linux or
630 some other Operating System.
631   What would you have done?
632 The Windows stuff merely needed to work smoothly, and indeed it nicely
633 does.  And the next-most-important thing was to swat all the bugs,
634 which didn't help the elegance of the Windows coding one bit.
635 Finally, if anyone wants to think of this as a kind of advertisement
636 of various skills, feel free to contact the original author through
637 his ISP, pinn.net.
638   --Vernon Nemitz (vnemitz)   August, 2004
639 */
640 
641 
642 
643 
644 // Header files list -- for those who don't know, these files give the
645 // compiler access to many "background" functions, variables, and
646 // constants, in a manner that is standardized, so that the "axle"
647 // doesn't have to be re-invented every time somebody writes a
648 // brand-new "wheel" program.
649 #include <windows.h>
650 #include <winbase.h>
651 #include <malloc.h>
652 #include <stdlib.h>
653 #include <stdio.h>
654 #include <string.h>
655 #include <ctype.h>
656 #include <io.h>
657 
658 
659 
660 // Error Codes: Out of Memory, Opening File, File Read, Linked List,
661 //              Defective Algorithm, Bad Data, File Write
662 #define CRP_ERR_OM 1
663 #define CRP_ERR_OF 2
664 #define CRP_ERR_FR 3
665 #define CRP_ERR_LL 4
666 #define CRP_ERR_DA 5
667 #define CRP_ERR_BD 6
668 #define CRP_ERR_FW 7
669 #define CRP_WM_QUIT 100
670 // That last one is not an error code; indicates user interrupting by
671 // pressing ESC key
672 
673 // Other definitions
674 #define PROCESSOR_INT_SIZE_32
675 #define PROCESSOR_ENDIAN_TYPE_LITTLE
676 // Definitions like these, in conjunction with #ifdef/#endif clauses,
677 // allow a program to be customized.  Blocks of code surrounded by
678 // #ifdef and #endif will ONLY be compiled if a particular "label",
679 // like PROCESSOR_INT_SIZE_32, has been defined.  These two
680 // definitions ensure that the source code in this program gets
681 // compiled to an executable that is suitable for all processors
682 // compatible with the Intel 80386, up to the hyperthreaded Pentium 4.
683 // The definitions are easily changed if some other processor is the
684 // intended destination of the compiled executable.
685 
686 
687 // Preprocessor macro
688 #define uli(x) ((unsigned long int)(x))
689 // For those inexperienced with the  C  programming language, the
690 // purpose of this is to be lazy; it takes a lot less time to type
691 // uli(variable)  than to type (unsigned long int)variable -- and part
692 // of the compiler known as the preprocessor will conveniently convert
693 // all instances of one into the other, thanks to that #define
694 // statement.  The preprocessor even generalizes the conversion of
695 // uli() so it doesn't matter what variable-name is inside the
696 // parentheses.  THEN the main compiler will use the longer expression
697 // to convert the variable's value from any other integer-type (such
698 // as signed short, or char) to unsigned long int.  Minor joke on
699 // original author:  This macro was not needed quite as often as was
700 // anticipated, at the time it was declared here.
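
// A cast macro is safest when both its body and its argument are
// wrapped in parentheses, because the preprocessor substitutes text
// without regard to operator precedence.  The sketch below shows the
// difference; the macro names ULI_BARE and ULI_PAREN, and the two
// functions, are hypothetical and exist only for this illustration.

```c
#define ULI_BARE(x)   (unsigned long int)x      // no extra parentheses
#define ULI_PAREN(x)  ((unsigned long int)(x))  // fully parenthesized

// Expands to (unsigned long int)d * 2 -- the cast applies to d alone,
// so 1.5 is truncated to 1 BEFORE the multiplication.
static unsigned long int bare_result(void)
{
  double d = 1.5;
  return ULI_BARE(d * 2);
}

// Expands to ((unsigned long int)(d * 2)) -- the product 3.0 is
// computed first, and only then converted.
static unsigned long int paren_result(void)
{
  double d = 1.5;
  return ULI_PAREN(d * 2);
}
```

// The two forms agree whenever the argument is a single variable,
// which is the only use described above; they differ once an
// expression is passed in.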
701 
702 // And here is an alternate way to do the same thing, for all the
703 // common data types
704 #ifdef PROCESSOR_INT_SIZE_32
705 typedef signed char        S_08; // signed, 8 bits
706 typedef signed short int   S_16; // signed, 16 bits
707 typedef signed long int    S_32; // signed, 32 bits
708 typedef unsigned char      U_08; // unsigned, 8 bits
709 typedef unsigned short int U_16; // unsigned, 16 bits
710 typedef unsigned long int  U_32; // unsigned, 32 bits
711 #endif
712 // NOW THERE IS NO POSSIBLE CONFUSION
713 // With respect to type-conversions, usually called "casting", the way
714 // to do it is, for example, like this:  (U_32)variable  -- which you
715 // can see is equivalent to the  (unsigned long int)variable
716 // mentioned in the previous paragraph.  The obvious syntactic
717 // difference between  (U_32)variable  and  uli(variable)  means that
718 // the compiler never gets confused -- but both ways are now easy to
719 // type.
720 
721 /*        SPECIAL NOTE REGARDING VARIABLE SIZES:
722     In the  C  language, the  int  type is supposed to be the
723 "natural" size for the processor.  That is, a compiler for a 32-bit
724 processor will be designed so that simple  int  variables are 32-bit
725 variables.  In prior years, when 16-bit processors ruled,  C
726 compilers were made so easy-to-write  int  variables were 16-bit
727 variables.  This is actually the primary reason for creating the
728 typedefs above, to remove any uncertainty about sizes of declared
729 variables.  Note that  short int  and  long int  do not shrink along
730 with  int:  the ANSI standard requires  short int  to hold at least
731 16 bits and  long int  at least 32 bits, even on a 16-bit compiler.
732 HOWEVER, BEWARE!  Those are only minimums; on some 64-bit systems a
733 long int  is 64 bits wide, so  S_32  and  U_32  would be too big.  All
734 four of the compilers specified previously in this file will handle
735 these typedefs as is desired and expected.
736     One aspect of the preceding is that there are concerns about the
737 future.  Already the phrase "long long int" is sometimes being used to
738 specify 64-bit data.  The  C  language does not really have a natural
739 way to accommodate ever-increasing data sizes, as the decades go by.
740 PERHAPS something like this will be agreed-upon, someday:  (1) Let
741 int  remain the natural data size for a processor -- and let the
742 compiler's documentation make it VERY clear what that size is!
743 (2) Let every usage of  long  specify a doubling of that size.
744 (3) Let every usage of  short  specify a halving of that size.
745 Yes, obviously the preceding typedefs break rule (2) because the
746 specified compilers for this program set the default size of an  int
747 to 32 bits, so "long int" above should not be needed, unless
748 specifying S_64 or U_64 data.  However, that was done mostly for the
749 benefit of the older Borland compiler, which evolved from a 16-bit
750 compiler, and has a chance of being confused about whether or not a
751 particular  int  is 32 bits.  Using  long  guarantees that the old
752 compiler will never be confused (and the other compilers don't have
753 those suggested Rules incorporated, so no problem).  But in the
754 future, should those suggested Rules be accepted and implemented, then
755 all that a program like this would need, if compiled for, say, a
756 128-bit processor which had some 256-bit abilities, is something like
757 this:
758 
759 #ifdef PROCESSOR_INT_SIZE_128
760 typedef signed char                    S_08; // signed, 8 bits
761 typedef signed short short short int   S_16; // signed, 16 bits
762 typedef signed short short int         S_32; // signed, 32 bits
763 typedef signed short int               S_64; // signed, 64 bits
764 typedef signed int                     S128; // signed, 128 bits
765 typedef signed long int                S256; // signed, 256 bits
766 typedef unsigned char                  U_08; // unsigned, 8 bits
767 typedef unsigned short short short int U_16; // unsigned, 16 bits
768 typedef unsigned short short int       U_32; // unsigned, 32 bits
769 typedef unsigned short int             U_64; // unsigned, 64 bits
770 typedef unsigned int                   U128; // unsigned, 128 bits
771 typedef unsigned long int              U256; // unsigned, 256 bits
772 #endif
773 
774 --AND, of course, that  #define PROCESSOR_INT_SIZE_32  above would
775 have to be changed.  Yet that would be the ONLY change, in this entire
776 program, for that other processor (Little-Endian/Big-Endian
777 controversy excepted, but that's a long comment for another place).
778 */
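
// For comparison, the 1999 revision of the  C  standard added the
// header <stdint.h>, which supplies exact-width integer types
// directly.  The sketch below ASSUMES a compiler that provides that
// header (older compilers, possibly including some of the four named
// in this file, may not), and it would replace the typedef block
// above rather than accompany it.  The two "..._bits" typedef names
// are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

typedef int8_t   S_08; // signed, 8 bits
typedef int16_t  S_16; // signed, 16 bits
typedef int32_t  S_32; // signed, 32 bits
typedef uint8_t  U_08; // unsigned, 8 bits
typedef uint16_t U_16; // unsigned, 16 bits
typedef uint32_t U_32; // unsigned, 32 bits

// Compile-time checks: an array of size -1 is illegal, so compilation
// stops right here if a type does not have the expected width.
typedef char s_16_has_16_bits[(sizeof(S_16) * CHAR_BIT == 16) ? 1 : -1];
typedef char u_32_has_32_bits[(sizeof(U_32) * CHAR_BIT == 32) ? 1 : -1];
```

// The same negative-array-size check also works with the
// "long int"-based typedefs, and would catch a compiler whose
// long int  is wider than 32 bits.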
779 
780 // SPECIAL NOTE REGARDING "signed" DATA:  The key fact to ALWAYS
781 // remember, with respect to computer programs and computer
782 // programming, is that all data is represented by numbers, and
783 // THEREFORE there are many ways in which numbers can be used to
784 // represent data.  It is extremely important that Representation
785 // #1 be consistently used, and not be confused with any other
786 // Representation.  In the particular case of "signed" data, it is
787 // first Decided that one single bit will represent a negative value
788 // -- but only if that bit is set.  For any "unsigned" data, it is
789 // Decided that that same bit will have no negative effect, even if
790 // set.  More specifically, consider the 8 bits inside a single byte
791 // of data:  Each bit has a particular numerical value if set, and
792 // these unsigned values are:  1, 2, 4, 8, 16, 32, 64, and 128.  Note
793 // that they can be added in any combination to yield results from 1
794 // to 255 (as well as Zero if none of the bits are set).  If we Decide
795 // that we want that byte to be able to accommodate negative numbers,
796 // then this "signed" byte has these bit-values:  1, 2, 4, 8, 16, 32,
797 // 64, and -128.  Now they can be added in any combination to yield a
798 // range of values from -128 to +127 (note that this is still a range
799 // of 256 different values).  The Rule for signed data is very simple
800 // for all data-sizes, and is always the same:  The largest-magnitude
801 // bit is negative if set.  All the other bits are always positive.
802 // So, for signed 16-bit (S_16) data, the biggest bit has the value of
803 // -32,768; for S_32 data, the biggest bit has the value of
804 // -2,147,483,648, and so on.
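
// The bit-value Rule just described can be checked with a few lines
// of code.  This is only an illustration; the function names
// signed_byte_value and unsigned_byte_value are hypothetical and are
// not used anywhere in this program.

```c
// Adds up the bit-values of one byte under the "signed" Decision:
// the largest-magnitude bit counts as -128, all others are positive.
static int signed_byte_value(unsigned char byte)
{
  static const int weight[8] = { 1, 2, 4, 8, 16, 32, 64, -128 };
  int bit, total = 0;
  for (bit = 0; bit < 8; bit++)
    if (byte & (1u << bit)) total += weight[bit];
  return total;
}

// Same bits, "unsigned" Decision: the top bit counts as +128.
static int unsigned_byte_value(unsigned char byte)
{
  static const int weight[8] = { 1, 2, 4, 8, 16, 32, 64, 128 };
  int bit, total = 0;
  for (bit = 0; bit < 8; bit++)
    if (byte & (1u << bit)) total += weight[bit];
  return total;
}
```

// With all eight bits set, the signed total is 127 - 128 = -1, while
// the unsigned total is 255:  same bits, different Representation.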
805 
806 
807 
808 
809 // Prepare a data structure suitable for many repeated
810 // parameter-groups
811 struct affect
812 { S_32  skipno;       // Number of times to waste pseudorandom
813                       //  calculations
814   char algo;          // algorithm code E (encrypt) or D (decrypt)
815   char pathname[50];  // limitation due to chosen window display
816   S_16  dotloc;       // For location of period preceding filename
817                       //  extension
818   U_32 length;        // length of file-to-affect
819   char kill;          // Y (yes) or N (no)
820   char newname[50];   // Because files are renamed and not always
821                       //  deleted
822   struct affect *prev, *next; // Ensure link-pointers set NULL after
823                               //  malloc()
824 }; // A linked list of these will accommodate many data files to
825    //  process
826 // And what is a "linked list"?  It is a thing that depends on the
827 // fact that a variable can hold a number which is the memory-location
828 // (or "address") of some other variable.  Such address-holding
829 // variables are called "pointers", and the address of a structured
830 // grouping of variables is no different from the address of a single
831 // variable (both are just numbers).  So, inside each instance of this
832 // structure called "affect", there is a variable called "next" which
833 // can hold the address of another "affect" structure.  A simple rule
834 // to follow is that if "next" does NOT hold a known-to-be-valid
835 // address, then it should hold the value of NULL (zero).  That way
836 // one can follow the "next" pointer-links through every item on the
837 // list of "affect" structures, and know when the end has been reached.
838 // The comment above, regarding "malloc()", refers to the usage of
839 // that function to allocate memory to hold an "affect" structure.
840 // Such a just-allocated structure should be seen as LAST on a linked
841 // list, and so it is generally logical that its "next" pointer be set
842 // to NULL.  When ANOTHER "affect" structure gets some just-allocated
843 // memory, then the earlier structure's "next" pointer can be given
844 // the address of the new structure, while the new one's "next"
845 // pointer is set to NULL.  Naturally, the "prev" pointers work the
846 // same general way, only backward, each pointing at the previous
847 // "affect" structure in the linked list.  The FIRST structure's
848 // "prev" pointer must be NULL, is all....
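
// The link-handling rules just described can be sketched with a
// cut-down structure.  This is only an illustration:  "struct node"
// and the three functions are hypothetical stand-ins, not the code
// this program uses to manage its "affect" list.

```c
#include <stdlib.h>

struct node                  // hypothetical stand-in for "struct affect"
{ int id;
  struct node *prev, *next;
};

// Allocates a new node and hangs it after "tail" (which may be NULL
// when the list is still empty).  The new node is LAST, so its "next"
// is set to NULL.  Returns the new node, or NULL if out of memory.
static struct node *append_node(struct node *tail, int id)
{
  struct node *n = (struct node *)malloc(sizeof *n);
  if (n == NULL) return NULL;
  n->id = id;
  n->prev = tail;            // NULL for the FIRST node
  n->next = NULL;
  if (tail != NULL) tail->next = n;
  return n;
}

// Follows the "next" links until NULL marks the end of the list.
static int count_forward(const struct node *first)
{
  int count = 0;
  while (first != NULL) { count++; first = first->next; }
  return count;
}

// Releases every node, starting from the first one.
static void free_list(struct node *first)
{
  while (first != NULL)
  { struct node *following = first->next;
    free(first);
    first = following;
  }
}
```

// Setting the links explicitly matters because malloc() does not
// clear the memory it hands out.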
849 
850 
851 
852 // prepare a data structure suitable for division calculations
853 struct scratchpad
854 { U_32 dvd; // dividend (becomes remainder, which is divided again...)
855   U_32 dvs; // divisor
856 }; // For lots and lots of division -- thousands of these are expected
857    //  to get used
858 // Note that the preceding structure could be suitable for holding
859 // 64-bit numbers.  Some 32-bit compilers include a similar structure,
860 // because there are occasions when 64-bit numbers are greatly needed.
861 // Using this  scratchpad  structure will be somewhat unconventional
862 // though, because the two variables are not named "high" and "low".
863 // But it could still WORK, thanks to the availability of dynamic
864 // data-type casting.  Not to mention that some 32-bit compilers don't
865 // have any equivalent.  And, below, a specific reason for doing this
866 // is presented.
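
// As an illustration of treating a pair of 32-bit variables as one
// 64-bit number, here is a sketch that uses shifts rather than the
// casting mentioned above, since shifts do not depend on byte order.
// ASSUMPTIONS:  a compiler that provides <stdint.h> and a 64-bit
// integer type; the first member is taken as the LOW half (as it
// would sit in memory on a little-endian processor).  The names
// pair32, pair_to_u64 and u64_to_pair are hypothetical.

```c
#include <stdint.h>

struct pair32              // hypothetical stand-in for "scratchpad"
{ uint32_t first;          // assumed low 32 bits
  uint32_t second;         // assumed high 32 bits
};

// Joins the two halves into one 64-bit value.
static uint64_t pair_to_u64(struct pair32 p)
{
  return ((uint64_t)p.second << 32) | (uint64_t)p.first;
}

// Splits a 64-bit value back into its two halves.
static struct pair32 u64_to_pair(uint64_t v)
{
  struct pair32 p;
  p.first  = (uint32_t)(v & 0xFFFFFFFFu);
  p.second = (uint32_t)(v >> 32);
  return p;
}
```

// Casting a pointer to the structure would give the same answer only
// on a processor whose byte order matches the assumed member order.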
867 
868 
869 // Although included in winbase.h, the Borland C++ Builder 5 compiler
870 // somehow fails to fully recognize the WIN32_FIND_DATA structure.  It
871 // may be just a bug for which a patch is available.  A work-around is
872 // this renamed clone:
873 struct W_F_D    // Since this overall W_F_D structure will be typecast
874 { U_32              Attribs;     // as a WIN32_FIND_DATA when actually
875   struct scratchpad CreateTime;   // used, the internal data types and
876   struct scratchpad LastAccessed;    // names are irrelevant.  Windows
877   struct scratchpad LastWritten;// just wants the correct total number
878   U_32              FileSzHi;                             // of bytes.
879   U_32              FileSzLo;    // This is the only variable WE need.
880   U_32              Rsrvd0;
881   U_32              Rsrvd1;
882   FAR char          FilNam[MAX_PATH];
883   FAR char          AltNam[14];
884 };
885 // Fortunately, no other work-arounds were needed, to get four
886 // different compilers to successfully process this file, with no
887 // editing needed by the user.  That is, even though the other three
888 // compilers don't need this structure, they can still accept it and
889 // use it and the result works fine.  On the other hand, this is to
890 // be expected when staying within the bounds of the ANSI standards
891 // for the  C  language.



			