Segmentation Faults

Frequently Asked Questions about ROMS usage

Moderators: arango, kate, robertson

kate
Posts: 3995
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Segmentation Faults

#1 Post by kate »

Just had a segmentation fault that I can't figure out at all:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
oceanG             00000000035D46E5  Unknown               Unknown  Unknown
oceanG             00000000035D2307  Unknown               Unknown  Unknown
oceanG             000000000357EA64  Unknown               Unknown  Unknown
oceanG             000000000357E876  Unknown               Unknown  Unknown
oceanG             0000000003531296  Unknown               Unknown  Unknown
oceanG             0000000003534E90  Unknown               Unknown  Unknown
libpthread.so.0    00007F62B67E87E0  Unknown               Unknown  Unknown
oceanG             00000000035082A5  nf_fread3d_mod_mp         156  nf_fread3d.f90
oceanG             0000000002EA599B  get_state_                851  get_state.f90
oceanG             0000000000F71736  initial_                  213  initial.f90
oceanG             000000000040C8EF  ocean_control_mod         133  ocean_control.f90
oceanG             000000000040B8B6  MAIN__                     95  master.f90
oceanG             000000000040B68E  Unknown               Unknown  Unknown
libc.so.6          00007F62B53A6D1D  Unknown               Unknown  Unknown
oceanG             000000000040B569  Unknown               Unknown  Unknown

It is failing while reading "u", specifically while reading the floating-point attributes of "u". This is a new initial file that I made the same way as the last one, which ROMS has read many times. The failure above was with ifort; trying again with gfortran doesn't fail at all, so I'm chalking it up to a compiler bug.
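
Since the crash is reported while reading the attributes of "u", one cheap sanity check is to compare those attributes between the old and new initial files. The following is only a sketch: `var_attrs` is a hypothetical helper, the file names `old_ini.nc` and `new_ini.nc` are placeholders, and it assumes the standard `ncdump -h` header layout in which attribute lines look like `u:long_name = "..." ;`.

```shell
# Print the attribute lines of one variable from `ncdump -h` output on stdin.
# var_attrs is a hypothetical helper name, not part of ROMS or netCDF.
var_attrs() {
  grep -E "^[[:space:]]*$1:" | sed -e 's/^[[:space:]]*//'
}

# Placeholder file names; the guard makes the sketch a no-op if they are absent.
old=old_ini.nc
new=new_ini.nc
if command -v ncdump >/dev/null 2>&1 && [ -f "$old" ] && [ -f "$new" ]; then
  ncdump -h "$old" | var_attrs u > old_u_attrs.txt
  ncdump -h "$new" | var_attrs u > new_u_attrs.txt
  if diff old_u_attrs.txt new_u_attrs.txt; then
    echo "attributes of u match"
  else
    echo "attributes of u differ"
  fi
fi
```

Matching on `u:` at the start of the line keeps attributes of other variables such as `ubar` out of the comparison.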

mathieu
Posts: 74
Joined: Fri Sep 17, 2004 2:22 pm
Location: Institut Rudjer Boskovic

Re: Segmentation Faults

#2 Post by mathieu »

Hi Kate,
In my experience, the compiler has never turned out to be wrong. A failure that shows up with one compiler but not with the other usually means there is a bug in the code.
To track the bug down with gfortran, you can compile with the options -fcheck=all -fsanitize=address -fsanitize=undefined.
For the Intel Fortran compiler, the options are -check all -warn interfaces,nouncalled -gen-interfaces.

There are other options for detecting NaN in the computation.
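
As a rough sketch of how the two flag sets above might be kept side by side, the helper below returns the checking flags for a given compiler. `debug_flags` is a hypothetical name, `-g` is added here as an assumption so that tracebacks carry line numbers, and the ifort flag is written as `-gen-interfaces`, which is the spelling in Intel's documentation. Where these flags go in a particular ROMS build (for example the FFLAGS of the compiler makefile) depends on your setup.

```shell
# Return runtime-checking flags for a given Fortran compiler.
# debug_flags is a hypothetical helper, not part of the ROMS build system.
debug_flags() {
  case "$1" in
    gfortran)
      echo "-g -fcheck=all -fsanitize=address -fsanitize=undefined" ;;
    ifort)
      echo "-g -check all -warn interfaces,nouncalled -gen-interfaces" ;;
    *)
      echo "unknown compiler: $1" >&2
      return 1 ;;
  esac
}

# Example: show the flags that would be appended for gfortran.
echo "FFLAGS += $(debug_flags gfortran)"
```

These checks slow the executable down considerably, so they are meant for debugging builds only.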

Re: Segmentation Faults

#3 Post by kate »

Thanks - those "check all" flags are scary! Both compilers warn about creating temporary arrays when reading parameter files (read_phypar, read_stapar, etc).

Ifort still fails in nf_fread3d when calling netcdf_get_fatt.

gfortran now fails in wclock_on because it is a nonrecursive procedure being called recursively (from the mp_barrier in there).

Gfortran without all the checking still runs.

Re: Segmentation Faults

#4 Post by mathieu »

The temporary arrays are created when you pass a non-contiguous array section such as A(1,:) to a subroutine. Because Fortran stores arrays in column-major order, those values are not contiguous in memory, so the compiler has to copy them into a new array, which of course slows things down. That is no problem if it only happens while reading the input parameters.

It is of course a problem if wclock_on is called recursively. The solution is to declare it as a "RECURSIVE SUBROUTINE".

The fact that the error occurs in netcdf_get_fatt suggests that the problem shows up inside the netcdf routine. So, two possibilities:

(A) The bug is in the netcdf routine itself (rather unlikely). Then one needs to compile netcdf itself with check all, which is hard work.

(B) The arguments are already corrupted before the call. Print the input to the function netcdf_get_fatt to check. A long time ago I had random errors because pointers had been overwritten by a previous function call. Such corruption can happen before the call to netcdf_get_fatt and cause the problem. Since compilers are free to organize memory as they want, this can explain why it works with gfortran but not with ifort.

Re: Segmentation Faults

#5 Post by kate »

I'm happy to ignore warnings during initialization.

Thanks, would have gotten to adding the recursive modifier, but had to leave yesterday. The gfortran case is now running past that.

The netcdf_get_fatt thing happens in the debugger when stepping into netcdf_get_fatt from nf_fread3d.
I can see the values of all eight arguments to netcdf_get_fatt and they are all fine. netcdf_get_fatt is a ROMS routine, so I should be able to step into it but no, that's when the error occurs for ifort.

I've been around long enough to believe in compiler bugs, no question.

Re: Segmentation Faults

#6 Post by mathieu »

Hernan, a remark on your point about "wrap-around integer". In Fortran (and, for signed integers, in C/C++) integer overflow is actually undefined behavior. See for example https://stackoverflow.com/questions/405 ... r-overflow
So gfortran is right to stop at that.

mitya
Posts: 3
Joined: Wed Apr 17, 2013 12:47 pm
Location: NTNU, Trondheim

Re: Segmentation Faults

#7 Post by mitya »

Hi Kate,
Have you managed to run this application with ifort?
I am building metroms on a recently deployed supercomputer (Betzy, in Norway) and run into exactly the same error in the same place (reading the attributes of u).
The toolchain I use on the new supercomputer is the same as on the previous one.

Another question -- I remember you also use metroms, have you tried to build metroms with gfortran as well?

Dmitry

Re: Segmentation Faults

#8 Post by kate »

Oh gosh, that was two and a half years ago! I have no memory of it whatsoever.

As for metroms, I haven't played with that lately either. I might have to go back to it if I can't get this other monster (CESM) working on our supercomputer.

Re: Segmentation Faults

#9 Post by mitya »

Heh, this reminds me of
https://xkcd.com/979/

well, on the bright side you haven't been bothered with this error since then!

Mitya

Re: Segmentation Faults

#10 Post by mitya »

OK, back to the case.
In my case this error was due to the stack size limit on our new supercomputer,
so it was solved by setting the limit to unlimited:
ulimit -s unlimited
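
On systems where the hard stack limit is finite, `ulimit -s unlimited` can fail for an ordinary user. A minimal sketch of a fallback, assuming a POSIX-style shell whose `ulimit` accepts `-S` and `-H` (`raise_stack_limit` is a hypothetical helper name), is to raise the soft limit as far as the hard limit allows:

```shell
# Raise the soft stack limit as far as the hard limit allows.
# raise_stack_limit is a hypothetical helper name, not part of ROMS.
raise_stack_limit() {
  if ulimit -S -s unlimited 2>/dev/null; then
    echo "stack limit: unlimited"
  else
    # The hard limit is finite; the soft limit cannot exceed it.
    hard=$(ulimit -H -s)
    ulimit -S -s "$hard"
    echo "stack limit: $hard (hard limit)"
  fi
}

raise_stack_limit
# Start the model from this same shell or job script so that it
# inherits the raised limit.
```

The limit applies only to the current shell and its children, so the call belongs in the job script itself rather than in a separate login session.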

Fatima
Posts: 1
Joined: Mon Aug 04, 2014 3:14 pm
Location: Oceanic and atmospheric science center

Re: Segmentation Faults

#11 Post by Fatima »

Hi,
I am trying to run my model on Fedora 30, using gfortran without MPI. I get this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7f92562bfd51 in ???
#1 0x7f92562bef15 in ???
#2 0x7f92555fbf3f in ???
#3 0x7f92565b17f7 in ???
#4 0x319998a in __mod_netcdf_MOD_netcdf_create
at /home/obuntooo/roms/upwelling1/Build_romsG/mod_netcdf.f90:5908
#5 0x2010b19 in def_his_nf90
at /home/obuntooo/roms/upwelling1/Build_romsG/def_his.f90:121
#6 0x20955b0 in __def_his_mod_MOD_def_his
at /home/obuntooo/roms/upwelling1/Build_romsG/def_his.f90:57
#7 0x52a648 in output_
at /home/obuntooo/roms/upwelling1/Build_romsG/output.f90:141
#8 0x41561e in main3d_
at /home/obuntooo/roms/upwelling1/Build_romsG/main3d.f90:235
#9 0x408b39 in __roms_kernel_mod_MOD_roms_run
at /home/obuntooo/roms/upwelling1/Build_romsG/roms_kernel.f90:175
#10 0x40531b in myroms
at /home/obuntooo/roms/upwelling1/Build_romsG/master.f90:86
#11 0x405462 in main
at /home/obuntooo/roms/upwelling1/Build_romsG/master.f90:50
Segmentation fault (core dumped)

Please help me fix it. Thank you.
