Value not being found in txt file but it is there

I am parsing a genealogy text file to extract some data from it to use elsewhere.

This is currently the code that I have and it is getting the name and the reference fine but not getting the sex of the person.

$file_handle = fopen($fn, "r");     
while(!feof($file_handle)){       $line = fgets($file_handle);       
if ((strpos($line, "0 @") !== false) && (strpos($line,"INDI")!==false)){           
$ref = explode("@",$line);           
$line = fgets($file_handle);           
if (strpos($line,"1 NAME") !==false){$name = str_replace("1 NAME ","",$line);$na = explode("/",$name);} 

if (strpos($line,"1 SEX") !==false){$sx= str_replace("1 SEX ","",$line);} 
echo $ref[1]." ".$na[0]." ".$na[1]." ".$sx."<br>";       
}   
} 
} 
fclose;

This is an example of a record from the file

0 @I3@ INDI
1 NAME Reginald William /Spencer/
2 GIVN Reginald William
2 SURN Spencer
1 SEX M

Can anyone please give me a clue as to why the Sex is not being read?

Well, what is this code supposed to do? Remove the extra " and carriage return…

To clean up the formatting…

$file_handle = fopen($fn, "r");     
while(!feof($file_handle)){       
	$line = fgets($file_handle);       
	if ((strpos($line, "0 @") !== false) && (strpos($line,"INDI")!==false)){           
		$ref = explode("@",$line);           
		$line = fgets($file_handle);           
		if (strpos($line,"1 NAME") !==false){
			$name = str_replace("1 NAME ","",$line);
			$na = explode("/",$name);
		} 

		if (strpos($line,"1 SEX") !==false){
		   $sx= str_replace("1 SEX ","",$line);
		} 
		echo $ref[1]." ".$na[0]." ".$na[1]." ".$sx."<br>";       
	}   
} 
fclose($file_handle);

An easier way to handle this, in my mind:

$file_handle = fopen($fn, "r");
while(!feof($file_handle)){
    $origin = '';
    $name = ['', ''];
    $sex = '';

    $line = fgets($file_handle);
    // get the origin
    if(preg_match('/@\s(.*)$/', $line, $output))
        $origin = $output[1];

    // get the name ($output[1] holds the text captured after NAME)
    if(preg_match('/NAME(.*)$/', $line, $output)) {
        // pad to two parts so $name[1] exists even without a /surname/
        $name = array_pad(explode("/", trim($output[1])), 2, '');
    }

    // get the sex
    if(preg_match('/SEX(.*)$/', $line, $output))
        $sex = trim($output[1]);
    else
        echo "No sex found.";

    echo "{$origin} {$name[0]} {$name[1]} {$sex}<br>";

}
fclose($file_handle);

Thanks for tidying up the code, I couldn’t find how to put it in code tags on this new forum.

@astonecipher Unfortunately your suggested code has more errors than mine, missing {} for instance.
@ErnieAlex now that the code is reformatted you can see that there is code there.

I need to do some more work to sort this, so I will keep plugging away. Any more bright ideas are most welcome.

I fixed the error on the if statement. Not sure what other errors there are, since I can’t actually run it to see what happens.

The way to format code in the new forum is to wrap it in three backticks (```).

What missing brackets? They aren’t required for a block, but the caveat is that it will only work for one-liners. (If you are referring to the “get origin” section?)

I use PHPStorm as my IDE and it showed lots of errors in your code, as did my dev site when I tried to run it after ignoring the warnings in PHPStorm. Shouldn’t all the if statements have their results in {}?

Hmmm, mind sharing those errors? I’d like to look at them and see what my brain missed!

if(1==1)
   echo 1;

is the same as

if(1==1) {
  echo 1;
}

Where it fails is when things get more complex, for example:

$a = 1;
$b = 2;
if($a == 2)
    echo $a;
    $b -= 1;

echo $b; // will be 1 because it is always changed.

The same applies to other control structures, such as loops:

for($a = 10; $a != 0; $a--)
    if($a == 5) {
        break;
    } else {
        echo "$a<br>";
    }

The “curly brackets” are called braces, at least here in the US. The braces are not needed in an IF clause if there is only one statement to run when the condition is true.

Some IDEs flag missing braces to make the programmer adhere to strict coding rules, and some require braces all the time. A good IDE usually has a way to turn that check off.

Thanks for those tips about the curly brackets. I need to check out my IDE because it shows the non-use of them as errors. I am still a relative novice at PHP.

Unfortunately I had closed down the IDE so cannot share the errors shown as they are lost on close down.

I am using this code on a custom page template within Wordpress and one of the errors it showed was something about the declaration of the $object variable. Maybe $object is a reserved variable in Wordpress.

I think that I should have explained my initial problem better. The SEX line in the file will always contain a value so there is no need to test for it not being set.

The flat file database that I am trying to parse will be in the form of:

(many lines not interested in)
0 @I1484@ INDI
1 NAME Susan /Lefevre/
2 GIVN Susan
2 SURN Lefevre
1 SEX F
….
0 @I1485@ INDI
1 NAME Frederick /Lefevre/
2 GIVN Frederick
2 SURN Lefevre
1 SEX M
….
0 @I1486@ INDI
1 NAME Albert Edward /Lefevre/
2 GIVN Albert Edward
2 SURN Lefevre
1 SEX M
….

So each new record starts with a 0. What I actually need to do is to put this data and some of the …. into a MySQL table and then display it as appropriate on a Wordpress page. This was the reason for my code initially pulling the data between the 0 lines into memory before writing to the MySQL table.

Any more clues / hints most welcome.

How large is the file?

I would read in to the delimiter and then parse the rows. Depending on how big the file is, you could put it all in memory.
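A minimal sketch of that idea, assuming each record begins at a line starting with "0 " as in the sample above. The function name read_records is made up for illustration. It hands back one record at a time, so only the current record sits in memory, whatever the file size.

```php
<?php
// Hypothetical helper: yields one record (an array of lines) at a time.
// A record starts at a line beginning with "0 " and runs until the next one.
function read_records($handle): Generator
{
    $record = [];
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");   // drop the line ending
        if ($line === '') {
            continue;                   // skip blank lines
        }
        // A new level-0 line closes the record collected so far.
        if (strncmp($line, '0 ', 2) === 0 && $record !== []) {
            yield $record;
            $record = [];
        }
        $record[] = $line;
    }
    if ($record !== []) {
        yield $record;                  // last record in the file
    }
}
```

It could be used as foreach (read_records($file_handle) as $record) { ... }, with the parsing of the rows done inside the loop.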

Well, in that case, I would parse it differently. Loop through the file and, if it hits a zero, parse the next four items inside that loop. If you are 100% sure of the format, you can lock in the way it is. You just loop through the text and, once you reach the 1 SEX line, you know you have all the info and can display it. Going back to your original code, here is a slightly different version to test.

$file_handle = fopen($fn, "r");
while(!feof($file_handle)){
    $line = fgets($file_handle);
    if ((strpos($line, "0 @") !== false) && (strpos($line,"INDI")!==false)) {
        $ref = explode("@",$line);
        $line = fgets($file_handle);
    } elseif (strpos($line,"1 NAME") !==false) {
        $name = str_replace("1 NAME ","",$line);
        $na = explode("/",$name);
    } elseif (strpos($line,"1 SEX") !==false) {
        $sx= str_replace("1 SEX ","",$line);
        echo $ref[1]." ".$na[0]." ".$na[1]." ".$sx."<br>";       
    }
}
fclose($file_handle);

Is your data for a single record actually stored in a single column as you have posted? Do you have any control over the format of how the data file is written? How is the data generated/where does it come from?

The data is a fixed format (gedcom) that I cannot alter.

The data file can be any size from a few KB right up to more than 10 MB, but is likely to be around 5-8 MB. So there could easily be 200,000 lines of data in the file. I suspect that this is too much to read into an array.

Unfortunately, whilst it is a standard, everybody implements it slightly differently and puts the data in a different order. The only thing that I can guarantee is that each individual’s record will start with 0 @ and all the data for that individual will continue as a variable number of data rows until the next individual’s record starts with 0 @.

For example this is a single record from a different source file and the data is in a totally different order:

0 @P140@ INDI
1 DEAT
2 DATE 22 Nov 1862
2 PLAC Nottingham, Nottinghamshire, England
1 BIRT
2 DATE 1841
2 PLAC Nottingham, Nottinghamshire, England
1 NAME John /Proctor/
1 SEX M
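Since the order of the rows cannot be relied on, one option is to look at every row of a record and pick out the tags of interest wherever they appear. This is only a sketch under that assumption. The function name parse_individual and the array keys are invented for the example, and it takes the rows of one record as an array.

```php
<?php
// Hypothetical helper: pulls the reference, name parts and sex out of one
// record, whatever order the level-1 rows arrive in.
function parse_individual(array $lines): ?array
{
    // The first row must look like "0 @I3@ INDI"; anything else is not a person.
    if (!preg_match('/^0 @([^@]+)@ INDI\s*$/', $lines[0] ?? '', $m)) {
        return null;
    }
    $person = ['ref' => $m[1], 'given' => '', 'surname' => '', 'sex' => ''];

    foreach ($lines as $line) {
        if (preg_match('/^1 NAME (.*)$/', $line, $m)) {
            // "John /Proctor/" splits into given name and surname on the slashes.
            $parts = explode('/', $m[1]);
            $person['given'] = trim($parts[0]);
            $person['surname'] = trim($parts[1] ?? '');
        } elseif (preg_match('/^1 SEX (\S+)/', $line, $m)) {
            $person['sex'] = $m[1];
        }
    }
    return $person;
}

$p = parse_individual([
    '0 @P140@ INDI',
    '1 DEAT',
    '2 DATE 22 Nov 1862',
    '1 NAME John /Proctor/',
    '1 SEX M',
]);
echo "{$p['ref']} {$p['given']} {$p['surname']} {$p['sex']}"; // P140 John Proctor M
```

Rows with other tags (DEAT, BIRT and so on) are simply ignored here, so more branches would be needed for any extra data that has to go into the table.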

Thanks for all the replies so far.

GEDCOM was a rather important piece of info. There are already a number of GEDCOM parsers available. Do you really need to roll your own?

Here is a list of many of them.
https://www.tamurajones.net/OpenSourceGEDCOMParsers.xhtml

Ideally not, BUT nearly all the PHP ones on that list are either no longer available or no longer supported. Plus I want to pull this into a Wordpress website and the current Wordpress plugins are poor. I found one, Genealone, but that is no longer supported either and it doesn’t work with Wordpress v5.x.

Did you try the test of the reformatted version I posted?
Your old version did not use multiple if-elses and I think this one should work better for you.

Unfortunately it isn’t working; it only gets the reference and sex, and there is no name in the output. I will play with the code today and see where I can get to (I am in a different time zone to you).

@ErnieAlex I found the bug in your code: the first elseif should be an if, as it comes after another file read. Works fine now.
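For anyone following along, this is roughly what the loop looks like with that change. It is a sketch, not the exact code from my page: the wrapper function list_individuals is made up so the loop can be run on its own, and it collects the output rows in an array instead of echoing them.

```php
<?php
// Hypothetical wrapper around the loop from the thread, with the first
// "elseif" changed to "if" so the line read after the INDI line is tested too.
function list_individuals($file_handle): array
{
    $rows = [];
    $ref = ['', ''];   // defaults avoid undefined-index warnings
    $na = ['', ''];
    while (($line = fgets($file_handle)) !== false) {
        if (strpos($line, "0 @") !== false && strpos($line, "INDI") !== false) {
            $ref = explode("@", $line);
            $line = (string) fgets($file_handle); // line after the INDI line
        }
        // Plain "if": this also runs for the line fetched just above.
        if (strpos($line, "1 NAME") !== false) {
            $name = str_replace("1 NAME ", "", $line);
            $na = array_pad(explode("/", $name), 2, '');
        } elseif (strpos($line, "1 SEX") !== false) {
            $sx = trim(str_replace("1 SEX ", "", $line));
            $rows[] = $ref[1] . " " . trim($na[0]) . " " . $na[1] . " " . $sx;
        }
    }
    return $rows;
}
```

One thing to watch: a row is only produced when a 1 SEX line is reached, and it uses whichever name was seen last, so a record with SEX before NAME would pick up the previous person's name.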

Sorry, typed that very fast yesterday! Glad it is working for you now!
