Local variables that depend on transformed values seem to accumulate memory from the start of the file. #113

paxcut · 2024-06-30T01:44:53Z

My use case is calculating CRC32 checksums of strings that have to be converted to lowercase before hashing. First we define a string type, in this case null-terminated (I chose a while-array to avoid errors about data not being placed). Then we embed the string in a container for the sole purpose of using transform. After defining the transform function, the possibly upper-cased strings are read from the file. We print one to verify that it works. The CRC32 checksum is computed into a local variable, which is formatted and exported.

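import std.string;
import std.core;
import std.hash;

// Number of strings to read; lower this if memory runs out.
u32 numFiles=1000;

// Null-terminated string: read chars up to the terminator, then skip it.
struct Name {
    char name[while($[$]!=0)];
    padding[1];
}[[inline]];


// Container whose only purpose is to apply the transform to a Name.
struct Lower {
    Name name [[transform("lower")]];
}[[inline]];


// Transform function: returns a copy of the Name converted to lowercase.
fn lower(auto file) {
    Name result;
    result.name=std::string::to_lower(file.name);
    return result;
};

// Print one string before and after the transform to verify it works.
Name test @0x285679; //0x1126b
Lower testLower @0x285679; //0x1126b
std::print(" Uppercase: {} \n Lowercase: {}",test,testLower.name);


// One entry per string: the lowercased name plus its CRC32 in a local variable.
struct CRCLookup {
    Lower fileName;
    u32 strCrc = std::hash::crc32(fileName.name, -1, 0x04C11DB7, -1, true, true) [[export,format("format_crc32")]];
};

// Display the checksum as hexadecimal.
fn format_crc32(auto val){
    return std::format("{:#x}",val);
};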

CRCLookup names[numFiles]  @ 0x285679; //0x1126b
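
To cross-check the exported checksums outside ImHex, here is a minimal Python sketch of the same idea: read null-terminated strings from a byte buffer, lowercase them, and hash them. It assumes the arguments passed to std::hash::crc32 above (initial value -1, polynomial 0x04C11DB7, final XOR -1, both reflections on) select the standard CRC-32 that zlib implements, and that the lowercasing is ASCII-only. The helper names read_cstring and lowercase_crc32_table are hypothetical and not part of ImHex.

```python
import zlib


def read_cstring(data: bytes, offset: int) -> tuple[bytes, int]:
    """Return the null-terminated string at offset and the offset just past its terminator."""
    end = data.index(b"\x00", offset)
    return data[offset:end], end + 1


def lowercase_crc32_table(data: bytes, offset: int, count: int) -> list[tuple[str, int]]:
    """Read `count` consecutive C strings and pair each lowercased name with its CRC-32."""
    rows = []
    for _ in range(count):
        raw, offset = read_cstring(data, offset)
        lowered = raw.lower()  # bytes.lower() only changes ASCII letters
        rows.append((lowered.decode("latin-1"), zlib.crc32(lowered)))
    return rows


if __name__ == "__main__":
    # Made-up sample data standing in for the symbol table of the input file.
    sample = b"HELLO.o\x00World.O\x00"
    for name, crc in lowercase_crc32_table(sample, 0, 2):
        print(f"{name}: {crc:#x}")
```

For the real input, the buffer would be the unzipped file's contents and the offset one of the two addresses used in the pattern.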

Input file: testMemory.zip

So far so good. I am running the same code shown here on an input file that has two identical sets of strings, one located close to the beginning of the file and the other at the end (the addresses are in the code above). Even though we skip reading the preamble data and both sets of patterns should be the same size, the resulting patterns differ greatly in size. In fact, depending on how much memory you have, you should limit the number of strings being read, or else your computer may hang.

Not only do the patterns in the second case consume much more memory, but the size increments also grow bigger and bigger as more strings are read. Both observations suggest that the patterns contain data from the start of the file or from some fixed position.

paxcut · 2024-06-30T06:02:13Z

Update: added the input file which I forgot for some reason.

To test the bug:

Load the input file. Unzip it first, and don't worry, it is just libihmhex.dll.a with the symbol table duplicated at its end.
Copy and paste the pattern listed above.
Run the pattern and note the memory usage.
Change the @ values to the commented ones and run it again.
Using 1000 strings, I see a huge difference: from 100 MiB for the low address to 2.66 GiB for the high address. Larger numbers of strings give even bigger differences.
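
As a rough sanity check of the "data from the start of the file" hypothesis, the following Python sketch estimates the memory that would be held if each of the 1000 entries kept a copy of everything from offset 0 up to the base address. This model is purely an assumption made for illustration, not a description of how ImHex actually stores patterns. It ignores the baseline memory of the application and the fact that each later entry starts slightly further into the file. The function name retained_bytes is hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def retained_bytes(base_address: int, count: int) -> int:
    """Bytes held if each of `count` entries kept a copy of the range [0, base_address)."""
    return base_address * count


# The two placement addresses used in the pattern, with 1000 strings each.
low = retained_bytes(0x1126B, 1000)
high = retained_bytes(0x285679, 1000)

print(f"low address:  {low / MIB:.1f} MiB")   # prints 67.0 MiB
print(f"high address: {high / GIB:.2f} GiB")  # prints 2.46 GiB
```

Both estimates land in the same range as the observed 100 MiB and 2.66 GiB, which is at least consistent with the hypothesis but does not prove it.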
