How to read an entire file into a string variable?
Blitz3D Forums/Blitz3D Beginners Area/How to read an entire file into a string variable?
| ||
Hi, I'm just beginning to code with Blitz+ and, as my first project, I would like to open an MS Publisher file and extract some text from it. Since I don't have the details of the .PUB file format, I plan to search the file for a certain string that always appears at the beginning of the text section I want to grab. The INSTR command would be great for this, but: 1- How can I put the entire file into a string variable without reading it byte by byte, which would be too slow? 2- Is a 200K binary file too big to fit in a string? Thanks in advance |
| ||
You'd be better off loading the file into a bank. |
| ||
Instead of reading the entire file into a single string you should read the file in a line at a time, using the same string variable every time, and check that line until you encounter the text you are looking for. Or use a bank like GFK suggests. I should really learn how to use banks for data; I haven't done much file i/o other than ReadLine/WriteLine. |
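For what it's worth, the line-at-a-time idea above could look roughly like the sketch below. FindInFileByLine, Path$, Needle$ and Chunk$ are made-up names for illustration; only ReadFile, Eof, ReadLine$, Instr and CloseFile are Blitz commands. One caveat I'd assume for a binary file like a .PUB: ReadLine splits on end-of-line bytes wherever they happen to fall, so a search string that spans one of those bytes would be missed.

```blitzbasic
; Hypothetical helper: returns True if Needle$ occurs inside any "line" of the file.
Function FindInFileByLine(Path$, Needle$)
    Stream = ReadFile(Path$)
    If Stream = 0 Then Return False      ; file could not be opened
    Found = False
    While Not Eof(Stream)
        Chunk$ = ReadLine$(Stream)       ; the same string variable is reused every pass
        If Instr(Chunk$, Needle$) > 0
            Found = True
            Exit
        EndIf
    Wend
    CloseFile Stream
    Return Found
End Function
```

Called as `If FindInFileByLine("Sample.pub", "some text") Then Notify "Found it"`, where the file name is just a placeholder.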
| ||
Is there a way to do a fast search on the content of a bank to retrieve a specific byte sequence? It needs to be fast since I have hundreds of these files, they are all about 180-200K in size. Only one file will be loaded at a time. That's why I was counting on INSTR to do the search. But, since the bank is in memory, that may be fast enough. I'll go do some more tests. |
| ||
Try this:

    Bank = LoadFileToBank("Filename.txt")
    Result = FindTextInBank("some text",Bank)
    If Result = 0
        Notify "String not found!"
    Else
        Notify "Result found at offset " + Result
    EndIf

    ; Loads the whole file into a new bank and returns the bank handle.
    Function LoadFileToBank(Filename$)
        FILE = ReadFile(Filename)
        BankID = CreateBank(FileSize(Filename))
        ReadBytes BankID,FILE,0,BankSize(BankID)
        CloseFile FILE
        Return BankID
    End Function

    ; Returns the 1-based position of Txt$ in the bank, or False (0) if not found.
    Function FindTextInBank(Txt$,BankID,Start = 0)
        ; Last offset where Txt$ can still fit; stops PeekByte reading past the end.
        MaxPos = BankSize(BankID) - Len(Txt$)
        If Txt$ = "" Or MaxPos < Start Then Return False
        Ptr = Start
        Repeat
            T$ = Chr$(PeekByte(BankID,Ptr))
            If T$ = Left$(Txt$,1)
                For N = 1 To Len(Txt$)-1
                    T$ = T$ + Chr$(PeekByte(BankID,Ptr+N))
                    If T$ = Txt$ Then Exit
                Next
            EndIf
            Ptr = Ptr + 1
        Until Ptr > MaxPos Or T$ = Txt$
        If T$ = Txt$
            Return Ptr   ; Ptr was already advanced by one, so a match at offset 0 returns 1
        Else
            Return False
        EndIf
    End Function

It works. Don't know how fast it is with big files though...

[EDIT] Just did a test on a 170k file - took ~4s to find a text string right near the end of the file. I haven't really used banks before - there's probably a way of optimising this so that it works in a fraction of the time it currently takes.

[EDIT 2] Try it with debug off! Same test as above takes about 0.3s. :)) |
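On the optimising remark: one possible tweak (untested and untimed, so treat it as a guess rather than a measured improvement) is to compare raw byte values instead of building a temporary string for every candidate position. FindBytesInBank and its variable names are made up for this sketch, and unlike the function above it returns a 0-based offset, or -1 when nothing is found.

```blitzbasic
; Hypothetical variant: compares bytes directly, no temporary strings.
; Returns the 0-based offset of Txt$ in the bank, or -1 if it is not there.
Function FindBytesInBank(Txt$, BankID, Start = 0)
    TxtLen = Len(Txt$)
    If TxtLen = 0 Then Return -1
    FirstByte = Asc(Left$(Txt$,1))
    MaxPos = BankSize(BankID) - TxtLen   ; last offset where Txt$ still fits
    For P = Start To MaxPos
        If PeekByte(BankID,P) = FirstByte
            N = 1
            While N < TxtLen
                ; Mid$ is 1-based, bank offsets are 0-based
                If PeekByte(BankID,P+N) <> Asc(Mid$(Txt$,N+1,1)) Then Exit
                N = N + 1
            Wend
            If N = TxtLen Then Return P  ; every byte matched
        EndIf
    Next
    Return -1
End Function
```

Usage would be something like `Offset = FindBytesInBank("some text", Bank)` followed by a check for `Offset >= 0`.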
| ||
Here's a trick that just occurred to me. WriteString writes an integer, the length of the string, followed by the string data. ReadString reads this back into a Blitz string. So you could use Blitz to build a new file consisting of an integer ( the length of the original file ) followed by the original file. Then use ReadString on this new file. |
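A rough sketch of that trick, assuming the file fits comfortably in memory. FileToString, Src$ and Tmp$ are invented names; the commands themselves (ReadBytes, WriteInt, WriteBytes, ReadString$, DeleteFile and so on) are standard Blitz. I haven't checked how Instr copes with binary data full of zero bytes, so that part is worth testing first.

```blitzbasic
; Hypothetical helper: builds Tmp$ as "length integer + original bytes",
; then lets ReadString$ pull the whole thing back as one string.
Function FileToString$(Src$, Tmp$)
    Size = FileSize(Src$)
    Bank = CreateBank(Size)
    Source = ReadFile(Src$)
    If Source = 0 Then Return ""         ; could not open the original
    ReadBytes Bank, Source, 0, Size
    CloseFile Source

    Dest = WriteFile(Tmp$)
    WriteInt Dest, Size                  ; the length prefix ReadString$ expects
    WriteBytes Bank, Dest, 0, Size       ; followed by the original file data
    CloseFile Dest
    FreeBank Bank

    Source = ReadFile(Tmp$)
    Body$ = ReadString$(Source)
    CloseFile Source
    DeleteFile Tmp$                      ; tidy up the temporary file
    Return Body$
End Function
```

After `Body$ = FileToString$("Sample.pub", "Sample.tmp")` (placeholder file names), `Instr(Body$, "some text")` gives the 1-based position, or 0 if the text isn't there.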
| ||
Wow! Plenty of good ideas for me to work on... At 0.3s, it will be more than fast enough. I may also try Floyd's idea in a slightly modified way: overwrite the first 4 bytes of the file with an integer equal to the length of the remaining bytes, and then read the file as a string. That should work, as I don't need those first 4 bytes anyway. I should work on a copy of the file, though, which may slow things down a bit. Thanks a lot, guys. |
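In case it helps anyone trying the modified version, here is how I'd picture it, as an untested sketch. ReadCopyAsString, Src$ and Tmp$ are hypothetical names; CopyFile, OpenFile, WriteInt and ReadString$ are the real commands. It assumes OpenFile starts at position 0, so the WriteInt lands on the first 4 bytes of the copy.

```blitzbasic
; Hypothetical helper: works on a copy so the original file is never modified.
Function ReadCopyAsString$(Src$, Tmp$)
    CopyFile Src$, Tmp$
    Size = FileSize(Tmp$)
    If Size < 4 Then Return ""           ; nothing useful past the 4-byte header
    Stream = OpenFile(Tmp$)              ; opened for update, position starts at 0
    If Stream = 0 Then Return ""
    WriteInt Stream, Size - 4            ; first 4 bytes become the string length
    CloseFile Stream

    Stream = ReadFile(Tmp$)
    Body$ = ReadString$(Stream)          ; reads the length, then the rest of the file
    CloseFile Stream
    DeleteFile Tmp$
    Return Body$
End Function
```

Positions reported by Instr on the result are then 1-based within the string, which starts 4 bytes into the original file.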
| ||
You guys should check the code archives... http://www.blitzbasic.com/codearcs/codearcs.php?code=685 and http://www.blitzbasic.com/codearcs/codearcs.php?code=687 |