Archicad C++ API
About Archicad add-on development using the C++ API.

[newbie] opening a textfile and reading it line by line

Anonymous
Not applicable
I tried to use the example from the documentation:
#include "DG.h"    // brings file selection dialog (and also includes Location)

IO::Location loc;

if (!DGGetOpenFile (&loc)) {    // returns the selected location in loc
    // no file (folder, link) location was selected
}
but I just couldn't get it to work. I even searched through all the documentation without finding any other reference to DGGetOpenFile. So I gave it up.

Then I tried:
IO::Location fileLoc;    // Location instance
IO::File file (fileLoc);
char buffer[128];

DG::FileDialog dlg (DG::FileDialog::OpenMultiFile);

if (!dlg.Invoke ())
    return false;

int count = dlg.GetSelectionCount ();

for (int n = 0; n < count; n++) {
    IO::File file (dlg.GetSelectedFile (n));
    errorCode = file.Open (IO::File::ReadMode);    // opening the file in read-only mode
    errorCode = file.ReadBin (buffer, 128);

    DGAlert (DG_INFORMATION, "Inside while", buffer, 0, "OK", "Cancel", 0);
}

This lets me open my file from a file dialog. One problem, though, is that I can't open it as myFile.sos, which is what it is; I have to rename it to myFile.txt. Why is that, and what can I do about it?

The other thing is that I can't read one line at a time. How do I do that?
Yet another thing: if I can't read one line at a time, how do I read the whole file at once? And the String Manager doesn't seem too well equipped with functions to parse the text and split it up into suitable chunks. Can I use standard C++ functions and includes as well as the ones that come with the API?
--
Regards,
Tor Jørgen
stefan
Advisor
Haven't tried the DG yet, but I can write to filestreams as in regular C/C++
#include <stdio.h>

FILE *stream = fopen ("test.dat", "w");
if (stream != NULL)
{
	fprintf (stream, "blablah %.3f \n", 27.0);
	fclose (stream);    // close only a stream that was actually opened
}
I guess, for ArchiCAD conformance, you should try to use the API tools instead. I don't know how portable this is, and you probably want to use the system File Open and Save dialogs as well...
--- stefan boeykens --- bim-expert-architect-engineer-musician ---
Archicad28/Revit2024/Rhino8/Solibri/Zoom
MBP2023:14"M2MAX/Sequoia+Win11
Archicad-user since 1998
my Archicad Book
Ralph Wessel
Mentor
Tor wrote:
IO::Location loc; 
 
if (!DGGetOpenFile (&loc)) {    // returns the selected location in loc
    // no file (folder, link) location was selected
}
but I just couldn't get it to work. I even searched through all the documentation without finding any other reference to DGGetOpenFile.
I haven't found any reference to this either, but it does work with a slight modification. Without specific documentation, you can often find the information you need by searching through the Support files, especially the Include files. For example, the declaration of DGGetOpenFile is:
DG_DLL_EXPORT bool CCALL	DGGetOpenFile (CIO::Location* retLocation,
										   long popupItemCount = 0, DGTypePopupItem* popupItems = NULL,
										   const CIO::Location* defLocation = NULL,
										   const char* title = NULL, long flags = 0);

DG_DLL_EXPORT long CCALL	DGGetOpenFile (CIO::Location** retLocationArray,
										   long popupItemCount = 0, DGTypePopupItem* popupItems = NULL,
										   const CIO::Location* defLocation = NULL,
										   const char* title = NULL, long flags = 0);
This shows us there are two variants of this function: one which allows a single file to be selected, and another for multiple files. You probably want the first. This expects a pointer to a CIO::Location (try this instead of IO::Location).

Next, you can optionally specify file filters for the format to be opened with popupItemCount and popupItems. 'popupItemCount' specifies the number of filters, and the filter specifications are in 'popupItems' (an array with popupItemCount items). Each item in popupItems is of type 'DGTypePopupItem', which can also be found in the Include files:
struct DGTypePopupItem {
	const char*	text;
	const char*	extensions;
	long		macType;
};
Briefly, you specify a name for the filter, the name extension, and a Macintosh file type descriptor (4 chars).

'defLocation' allows you to specify a default starting point (not usually a good idea), and 'title' is the dialog title. 'flags' allows you to specify some of the file browser behaviour and appearance. Use the following:
// --- Constants for DGGetOpenFile ---------------------------------------------

#define DG_OF_NO_ALL_FILES				0x0001
#define DG_OF_NO_ROOT_GROUP				0x0002
#define DG_OF_GROUPS_FIRST				0x0004
#define DG_OF_DISPLAY_EXTENSIONS		0x0008
#define DG_OF_DONT_DISPLAY_EXTENSIONS	0x0010
#define DG_OF_LIST_SINGLE_CHILD_GROUPS	0x0020
So, if you wanted the user to open a single plain text file, and you didn't want to see all file types, you could write:
	CIO::Location loc;
	DGTypePopupItem	popup[1] = { {"Plain Text", "txt", 'TEXT'} };
	if (!DGGetOpenFile (&loc, 1, popup, 0, "Test", DG_OF_NO_ALL_FILES)) {
	    //No file selected
	}
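Each of the DG_OF_ constants listed above occupies its own bit, so it seems reasonable to assume they can be combined with bitwise OR, although the header excerpt does not say so explicitly. A minimal sketch of that idea follows. The two constants are copied from the excerpt (in a real add-on they come from DG.h), and the helper names MakeOpenFileFlags and HasFlag are made up for illustration.

```cpp
// Values copied from the header excerpt quoted above. In a real add-on
// they come from DG.h and should not be redefined like this.
const long kNoAllFiles        = 0x0001;   // DG_OF_NO_ALL_FILES
const long kDisplayExtensions = 0x0008;   // DG_OF_DISPLAY_EXTENSIONS

// Hypothetical helper: builds a flags value from two yes/no options.
inline long MakeOpenFileFlags (bool hideAllFiles, bool showExtensions)
{
    long flags = 0;
    if (hideAllFiles)
        flags |= kNoAllFiles;
    if (showExtensions)
        flags |= kDisplayExtensions;
    return flags;
}

// Hypothetical helper: tests whether a given bit is set in 'flags'.
inline bool HasFlag (long flags, long mask)
{
    return (flags & mask) != 0;
}
```

The resulting value would be passed as the last argument of DGGetOpenFile in place of the single constant used in the example.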
Tor wrote:
The other thing is that I can't read one line at a time. How do I do that? ... if I can't read one line at a time, how do I read the whole file at once? And the String Manager doesn't seem too well equipped with functions to parse the text and split it up into suitable chunks. Can I use standard C++ functions and includes as well as the ones that come with the API?
You can use the standard C/C++ library (or others) for strings and i/o, but there are some issues to consider. Are you sure the file content is always text? And, more importantly, what is the encoding of the text, e.g. UTF-8? You need to keep in mind that a single character is not necessarily a single byte.

The standard C/C++ libraries largely work cross-platform. I recommend the C++ libraries over C, and it is well worth understanding the STL. The containers in the STL (vector, list, map etc) are particularly valuable.

The ArchiCAD API can help with international text (to some extent), and it is best to try to use their string functions when interacting with ArchiCAD. Buffering and parsing text is a fairly big subject though. When you say you want to read a line, do you mean a line terminated by some combination of carriage return and linefeed, or a logical line in the syntax of the file format you are parsing? In any case, you need to parse the text - accumulating everything you read to a buffer - until you discover the conditions which terminate the line (or the file).

If you are using the ArchiCAD API File class, you can obtain the file size with the GetDataLength method. You could then allocate a memory block large enough to accommodate the entire file and read it in one hit. You need to consider what size these files may be if you take this approach - buffering the contents will likely be far more efficient.
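For comparison, the same "read it in one hit" idea can be sketched with the standard C++ library instead of the API File class. This is only an illustration under that substitution; ReadWholeFile is a made-up name, and the whole file ends up in memory, so the size caveat above still applies.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads the entire file at 'path' into 'out' as raw bytes.
// Returns false if the file cannot be opened.
bool ReadWholeFile (const std::string& path, std::string& out)
{
    // Binary mode keeps "\r\n" sequences exactly as stored on disk.
    std::ifstream in (path.c_str (), std::ios::binary);
    if (!in)
        return false;
    out.assign (std::istreambuf_iterator<char> (in),
                std::istreambuf_iterator<char> ());
    return true;
}
```

A std::string manages its own memory and knows its own length, which avoids having to track the buffer size separately.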
Ralph Wessel BArch
Software Engineer Speckle Systems
Anonymous
Not applicable
Thanks again for a very thorough answer, Ralph 🙂
and thanks for the tip about using filestreams as in regular C/C++, Stefan.

Ralph wrote:
When you say you want to read a line, do you mean a line terminated by some combination of carriage return and linefeed, or a logical line in the syntax of the file format you are parsing? In any case, you need to parse the text - accumulating everything you read to a buffer - until you discover the conditions which terminate the line (or the file).
I mean a line terminated by some combination of carriage return and linefeed.

When I start reading the first part of the first line, which goes like this:

.HODE 0:

and write it to the alert or the Report window, what I get is this:

.??†
H??†
O??†
D??†
E??†
??†
0??†
?†


I've copied this straight from the Report window. In the preview, the last line, which is a colon followed by two question marks, is shown as a confused smiley followed by one question mark. (The smiley is kind of correct, though...)
Ralph wrote:
You could then allocate a memory block large enough to accommodate the entire file and read it in one hit. You need to consider what size these files may be if you take this approach - buffering the contents will likely be far more efficient.

I've tried to use
errorCode = file.GetDataLength(&fLength);
but I'm not allowed to use the fLength-value to set the length of my charbuffer.

And I still can't make the DGGetOpenFile() work. I've included DG.h, File.hpp and Filesystem.hpp, do I need more?

--
Regards,
Tor Jørgen
Ralph Wessel
Mentor
Tor wrote:
Ralph wrote:
When you say you want to read a line, do you mean... etc
I mean a line terminated by some combination of carriage return and linefeed.
OK. Standard line endings are:
  • Mac: carriage-return "\r"
  • Win: carriage-return + linefeed "\r\n"
  • Unix: linefeed "\n"
If you simply wanted to find where a line ended in an array of characters, you could use something like the 'find_first_of' method in the STL std::string class. I've implemented my own string class based on the String Manager functions to provide this functionality. However, note my comments on character width and encoding below.
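A minimal sketch of the find_first_of approach, using std::string rather than a String Manager based class. SplitLines is a made-up name, and the code assumes a single-byte encoding in which the bytes for "\r" and "\n" only ever mean a line ending.

```cpp
#include <string>
#include <vector>

// Splits 'text' into lines, accepting "\r", "\n" and "\r\n" as terminators.
// The terminators themselves are not included in the returned lines.
std::vector<std::string> SplitLines (const std::string& text)
{
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (start < text.size ()) {
        std::string::size_type end = text.find_first_of ("\r\n", start);
        if (end == std::string::npos) {
            // The last line has no terminator.
            lines.push_back (text.substr (start));
            break;
        }
        lines.push_back (text.substr (start, end - start));
        // Treat "\r\n" as one terminator, not two.
        if (text[end] == '\r' && end + 1 < text.size () && text[end + 1] == '\n')
            ++end;
        start = end + 1;
    }
    return lines;
}
```

Because all three conventions are accepted, the same code copes with files written on any of the platforms listed above.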
Tor wrote:
When I start reading the first part of the first line, which goes like this:
.HODE 0:
and write it to the alert or the Report window, what I get is this:
.??†
etc...
As I mentioned above, your text may well be in a multi-byte encoding. Basically, the old single-byte character sets (ASCII and its 8-bit extensions) allowed for only 256 character symbols at most (1 byte per character), but many languages have vastly more symbols than that. Therefore, text needs to be in a form which can account for thousands of character symbols and also provide for commonality between them, e.g. common numeric symbols. This is a fairly deep subject, but the String Manager functions in the ArchiCAD API may be able to provide what you need.

The problem is, I don't know much about the file format you are dealing with. Standard XML, for example, declares the encoding type so you know how to interpret it. Does the documentation for this format discuss encoding?

Your encoding could be UTF-32 (4 bytes per character) because every character in your file is displayed with 4 symbols, and the leading character is the ASCII value. Read the String Manager documentation, and in particular look at 'GSCharCode' and 'CHSetDefaultCharCode'.
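One way to test the UTF-32 guess is to look at the raw bytes at the start of the buffer. The sketch below (HexDump is a made-up helper) formats bytes as hex. If the file were little-endian UTF-32, the text ".H" would show up as 2E 00 00 00 48 00 00 00, whereas a single-byte encoding gives just 2E 48.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats the first 'count' bytes of 'data' as two-digit upper-case hex
// values separated by single spaces.
std::string HexDump (const char* data, std::size_t count)
{
    std::string out;
    char hex[4];
    for (std::size_t i = 0; i < count; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char> (data[i]);
        std::snprintf (hex, sizeof hex, "%02X", value);
        if (i > 0)
            out += ' ';
        out += hex;
    }
    return out;
}
```

Writing the result for the first dozen or so bytes to the Report window should make it clear how many bytes each character really occupies.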
Tor wrote:
You could then allocate a memory block large enough to accommodate the entire file...

I've tried to use
errorCode = file.GetDataLength(&fLength);
but I'm not allowed to use the fLength-value to set the length of my charbuffer.
I don't know what you mean by, "not allowed to use fLength" - it's just a number which can be used to allocate a block of memory. Are you doing something like this:
		USize fLength = 0;
		file.GetDataLength(&fLength);
		GSPtr buff = BMAllocatePtr(fLength, ALLOCATE_CLEAR, 0);
You can then read the file data into 'buff'.
Tor wrote:
And I still can't make the DGGetOpenFile() work. I've included DG.h, File.hpp and Filesystem.hpp, do I need more?
Did you copy/paste my example into your file?
   CIO::Location loc; 
   DGTypePopupItem   popup[1] = { {"Plain Text", "txt", 'TEXT'} }; 
   if (!DGGetOpenFile (&loc, 1, popup, 0, "Test", DG_OF_NO_ALL_FILES)) { 
       //No file selected 
   }
Does this work?
Anonymous
Not applicable
Thanks again,
I tried copying your code into my project, Ralph, and after adding some more includes it worked 🙂

I still can't get my debugging to work, so as a workaround I'm looking for a way to write numbers to the alert or the reportwindow. Is there a way to do this? I guess I have to do some sort of casting, but I just can't figure out how...
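No cast is needed for this; the number has to be formatted into text. A minimal sketch using std::snprintf from the standard library follows. The names FormatUnsigned and FormatReal are made up for this example.

```cpp
#include <cstdio>
#include <string>

// Formats an unsigned integer (e.g. a file length) as decimal text.
std::string FormatUnsigned (unsigned long value)
{
    char text[32];
    std::snprintf (text, sizeof text, "%lu", value);
    return text;
}

// Formats a floating-point value with three decimal places.
std::string FormatReal (double value)
{
    char text[64];
    std::snprintf (text, sizeof text, "%.3f", value);
    return text;
}
```

The c_str () of the returned string is a plain const char*, so it should be usable wherever the earlier code passed a character buffer, such as the DGAlert call.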

I've done most of my programming in flash before - except for some openGL code in C++ during my courses at the university - and it's really dawning on me that flash is kind of like playing in a sheltered kindergarten with the macromedia guys acting as the grownups making sure the kids don't hurt themselves...

--
Regards,
Tor Jørgen
Anonymous
Not applicable
I found out that I can use ultoa, which made me realise - when I saw the numbers - that I'm not quite into this stuff yet...

--
Regards,
Tor Jørgen
Anonymous
Not applicable
I'm getting a little lost in all those ptrs and handles...
so I'll just keep posting these rookie-questions

I'm reading the file into a buffer like Ralph suggested
		GSPtr buff = BMAllocatePtr(fLength, ALLOCATE_CLEAR, 0); 
		errorCode = file.ReadBin(buff, fLength); 

Then I'm searching for the first occurrence of "\r" like this
		GSPtr linePtr = CHSearchSubstring("\r", buff, fLength);	// pointing to the first cr 

Then I'm left with a ptr to the beginning of the text in buff and a ptr to the first cr in linePtr.
The length of buff is 22261 and the length of linePtr is 0, which is not quite what I wanted - but I guess it makes sense since it just points to the position of cr.

My question then is: what kind of math do I have to perform to be able to copy just the first line into another char buffer?
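The math in question is pointer subtraction: two pointers into the same buffer can be subtracted, and the result is the number of bytes between them. A minimal sketch follows, with the standard std::memchr standing in for the API search call and a made-up function name, FirstLine.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns the first line of 'buff' without its terminator. 'length' is the
// number of valid bytes in 'buff'; the buffer need not be zero-terminated.
std::string FirstLine (const char* buff, std::size_t length)
{
    // std::memchr stands in for the API search call used in the post above.
    const char* cr = static_cast<const char*> (std::memchr (buff, '\r', length));

    // Pointer difference = number of bytes before the carriage return.
    // If no carriage return is found, the whole buffer is one line.
    const std::size_t lineLength =
        (cr != nullptr) ? static_cast<std::size_t> (cr - buff) : length;

    return std::string (buff, lineLength);
}
```

The next line then starts one byte past the carriage return (two bytes if the terminator is "\r\n"), and the number of bytes still to be searched is the original length minus that offset.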

--
Regards,
Tor Jørgen
Oleg
Expert
As Ralph wrote, perhaps your file has UTF-32 encoding.
It seems the ArchiCAD API has no support for this encoding.
I think it will be easier for you to convert your file to ASCII (or, as a last resort, to Unicode, if ASCII is not suitable) with some external software.
In fact, you cannot search for an ASCII substring like "\r" in a differently encoded string.
Anonymous
Not applicable
I don't know what kind of encoding it has, but one of the first lines in the file says characterset DOSN8, which I'm guessing is some sort of DOS text file with Norwegian characters in it - in other words it should include Æ Ø Å, but I'm getting stuff like ù è when I view it in TextEdit here on my Mac.
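That symptom fits a single-byte DOS encoding: in DOS code page 865 (Nordic), Ø and Å are the bytes 0x9D and 0x8F, and those same two bytes are displayed as ù and è in the Mac Roman character set. Whether DOSN8 really means code page 865 is an assumption here, not something the thread confirms. Under that assumption, a minimal sketch of converting the six Norwegian letters to UTF-8 could look like this (NorwegianDosToUtf8 is a made-up name):

```cpp
#include <string>

// Converts text from DOS code page 865 to UTF-8, but ONLY for plain ASCII
// and the six Norwegian letters. Any other byte >= 0x80 becomes '?'.
// Assumption: the input really is code page 865.
std::string NorwegianDosToUtf8 (const std::string& in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size (); ++i) {
        const unsigned char c = static_cast<unsigned char> (in[i]);
        switch (c) {
            case 0x91: out += "\xC3\xA6"; break;   // æ (U+00E6)
            case 0x92: out += "\xC3\x86"; break;   // Æ (U+00C6)
            case 0x9B: out += "\xC3\xB8"; break;   // ø (U+00F8)
            case 0x9D: out += "\xC3\x98"; break;   // Ø (U+00D8)
            case 0x86: out += "\xC3\xA5"; break;   // å (U+00E5)
            case 0x8F: out += "\xC3\x85"; break;   // Å (U+00C5)
            default:
                out += (c < 0x80) ? static_cast<char> (c) : '?';
                break;
        }
    }
    return out;
}
```

Since every character is a single byte in such an encoding, searching the raw buffer for "\r" works as expected, which matches what is described in the rest of this post.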

When I read it into buff using the code I showed in my last post, and then write it to an alert or the Report window, the output looks exactly like the file when I view it in TextEdit. From that I'm thinking that the API can handle the file OK.
And if I search for "\r" and then write the new ptr to the alert or reportwindow, it shows the file starting at line 2.

I think my problem is more basic, like how do I use those ptrs and stuff to find out how many bytes I have to read or copy to get my hands on the next line. Maybe it's so basic you can't even see the problem I'm having...

--
Regards,
Tor Jørgen