All times are UTC-06:00




Posted: Thu Jul 21, 2011 10:28 am

Joined: Thu Jul 21, 2011 10:21 am
Posts: 4
I recently started using an Efika MX SmartTop, and it seems to crash rather often under sustained heavy load (~2 hours running at near max CPU). Nothing in /var/log is complaining near the time of the crash.

Is there a way to enable some kind of kernel dumping on panic, or additional diagnostic information that might help me understand what's happening? It might just be the hardware overheating (though it has adequate ventilation).

I am using the stock Ubuntu 10.10 image w/ up-to-date packages.


Posted: Thu Jul 21, 2011 2:25 pm
Genesi

Joined: Mon Jan 30, 2006 2:28 am
Posts: 409
Location: Finland
Hi.

Could you let me know what exactly you're running that causes the crash? We have had several systems running over long periods of time under heavy load without problems.

If you have a specific piece of software that you're running, and it always happens around the 2 hour mark, I am inclined to think it is an issue with that particular software.

If you can provide more info here, we can try to duplicate your issue.


Johan.

_________________
Johan Dams, Genesi USA Inc.
Director, Software Engineering

Yep, I have a blog... PurpleAlienPlanet


Posted: Thu Jul 21, 2011 3:53 pm

Joined: Thu Jul 21, 2011 10:21 am
Posts: 4
It is a large test suite that was developed internally, so unfortunately I can't distribute it. It runs as my local user, with no root permissions and nothing in kernel mode. I've not installed any drivers or anything odd; however, I *am* heavily reading from a Western Digital external hard drive via one of the USB 2.0 ports on the front of the device.

The crash isn't 100% repeatable; it has simply happened multiple times, and all of those times it's been while running this particularly resource-hungry application for several hours.

I am curious, by what mechanism (other than a fork bomb, or perhaps using up all disk space) should a user-level process be able to bring down the system to the point where it cannot even respond to a ping? Is there some resource it might be starving the system of completely? The system is perfectly usable while this is running until it freezes. I've got plenty of disk space, however the process does hit the RAM rather heavily.


Posted: Thu Jul 21, 2011 4:09 pm
Genesi

Joined: Mon Jan 30, 2006 2:28 am
Posts: 409
Location: Finland
Hi.
Quote:
I *am* heavily reading from a Western Digital external hard drive via one of the USB 2.0 ports on the front of the device
I take it that this drive has its own power supply?
Quote:
all of those times it's been while running this particularly resource-hungry application for several hours
Have you tried any other resource-hungry operations, for instance a kernel compile?
Quote:
I am curious, by what mechanism (other than a fork bomb, or perhaps using up all disk space) should a user-level process be able to bring down the system to the point where it cannot even respond to a ping?
If you have no user limits set, a memory leak can do that. Try monitoring memory usage and check if it starts swapping heavily.


Johan.

Posted: Thu Jul 21, 2011 4:22 pm

Joined: Thu Jul 21, 2011 10:21 am
Posts: 4
Yes, the drive has its own power supply; that was one of the requirements for the drive.

As for other intensive processes, I've run compiles of our own internal software (C/C++) that take ~3-5 hours or so to complete, and this does not cause any adverse effects.

I could believe that memory leaks were possible in the program we are compiling, as the allocator it uses was originally designed for use on Intel processors, and has had some issues with alignment differences on ARM. Is there some sign I should see post-crash that would indicate that this is taking place?

I can try ulimiting the test. What memory size would you recommend I limit it to?


Posted: Thu Jul 21, 2011 4:34 pm
Genesi

Joined: Mon Jan 30, 2006 2:28 am
Posts: 409
Location: Finland
Hi.
Quote:
I could believe that memory leaks were possible in the program we are compiling, as the allocator it uses was originally designed for use on Intel processors, and has had some issues with alignment differences on ARM. Is there some sign I should see post-crash that would indicate that this is taking place?

I can try ulimiting the test. What memory size would you recommend I limit it to?
If you are really running low on RAM and swap space, whether due to a memory leak or just your app's requirements, it should be easy to notice. You can write a script that logs memory usage and other resources every so often to see what happens.
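
A logging script along those lines might look roughly like the sketch below. It is only an illustration: the function names, the `memlog.txt` file name and the 60-second interval are made up for this example, and it assumes a Linux `/proc/meminfo`. Each line is flushed and synced to disk so that the last sample has a chance of surviving a hard freeze.

```python
import os
import time

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info

def summarize(info):
    """Build a one-line summary of free memory, page cache and swap in use."""
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return "MemFree=%d kB Cached=%d kB SwapUsed=%d kB" % (
        info.get("MemFree", 0), info.get("Cached", 0), swap_used)

def log_forever(log_path="memlog.txt", interval=60):
    """Append a timestamped memory summary every `interval` seconds."""
    while True:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        with open(log_path, "a") as log:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write("%s %s\n" % (stamp, summarize(info)))
            # Push the line to disk now; a frozen box never flushes buffers.
            log.flush()
            os.fsync(log.fileno())
        time.sleep(interval)
```

Start `log_forever()` in a second terminal before the test run. If SwapUsed climbs steadily right up to the last logged line, that points at memory exhaustion rather than heat.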

As for a proper ulimit, I'd try to get it done without using swap at all.
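
One way to apply such a limit to just the test process is sketched below, assuming Python is available on the box. `run_with_memory_limit` is a hypothetical helper, not part of any Genesi tooling. It caps the address space (`RLIMIT_AS`, the same thing `ulimit -v` sets), which limits virtual rather than resident memory, so treat the figure as an approximation.

```python
import resource
import subprocess

def run_with_memory_limit(cmd, limit_bytes):
    """Run cmd with its address space capped at limit_bytes; return its exit code."""
    def apply_limit():
        # Executed in the child just before exec, so only the test is limited,
        # not the shell or script that launched it.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    return subprocess.call(cmd, preexec_fn=apply_limit)
```

For example, `run_with_memory_limit(["./run_tests"], 384 * 1024 * 1024)`, where both the command name and the 384 MiB figure are placeholders. Pick a value below the machine's physical RAM so the process fails with an allocation error before the system starts swapping.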

Johan.

Posted: Fri Jul 22, 2011 10:47 am

Joined: Thu Jul 21, 2011 10:21 am
Posts: 4
I reran the test suite, this time without encountering any problems. The code was unchanged; however, I did have the SmartTop up on stilts to give it more passive cooling capacity. One run is too small a sample, so I'll be running an aggressive set of tests this weekend.

Is there any way to access internal temperature sensors? Also, does the SmartTop throttle its processor under the 10.10 distribution, and can that be monitored/controlled?

I'll monitor the memory situation, as well as open file handles, etc, and see if there's any evidence of resource starvation during the testing process.
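
For the file handle side, a rough sketch of what that monitoring could look like, assuming the usual Linux `/proc` layout. The helper names are invented for this example, and reading another process's `fd` directory only works for processes owned by the same user (or as root).

```python
import os

def count_open_fds(pid, proc_root="/proc"):
    """Count the file descriptors a process currently holds open."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))

def parse_file_nr(text):
    """Split /proc/sys/fs/file-nr into (allocated, unused, maximum)."""
    allocated, unused, maximum = (int(x) for x in text.split())
    return allocated, unused, maximum
```

Logging `count_open_fds(pid)` for the test suite's PID alongside the memory figures should show whether descriptors leak over a multi-hour run. The system-wide totals come from passing the contents of `/proc/sys/fs/file-nr` to `parse_file_nr`.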

Thanks!


Posted: Fri Jul 22, 2011 11:58 am
Genesi

Joined: Mon Jan 30, 2006 2:28 am
Posts: 409
Location: Finland
Hi.

As far as I remember, there are no temperature sensors and no CPU throttling is done.
Lack of cooling should not be an issue, unless you are operating in high ambient temperatures.


Johan.

Posted: Sat Jul 23, 2011 8:43 pm
Site Admin

Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
Check "lsmod" for cpufreq_ondemand; if it's loaded, it will be clocking the CPU back to 160MHz at certain points to reduce CPU power consumption.

You can blacklist cpufreq_ondemand on the SmartTop (this is actually done on the newest image, released yesterday) and see if it makes any difference. It should greatly improve performance of the SmartTop in general, as cpufreq_ondemand is notorious for underclocking the CPU at random times. On the Smartbook we enable it because we want better battery life, but we are investigating whether we want or need CPU power management to get it.
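
To make the lsmod check scriptable, and to confirm the blacklist took effect after a reboot, something like the following sketch could be used. It reads `/proc/modules` (the file `lsmod` formats) and the standard cpufreq sysfs files. Whether this particular kernel exposes the sysfs files is an assumption, so missing files are reported as `None`. The function names are made up for this example.

```python
import os

def loaded_modules(text):
    """Return the set of module names from /proc/modules-style text."""
    return set(line.split()[0] for line in text.splitlines() if line.strip())

def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except (IOError, OSError):
        return None

def cpufreq_status(modules_path="/proc/modules",
                   cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Report whether cpufreq_ondemand is loaded, plus governor and clock (kHz)."""
    modules = loaded_modules(read_text(modules_path) or "")
    return {
        "ondemand_loaded": "cpufreq_ondemand" in modules,
        "governor": read_text(os.path.join(cpufreq_dir, "scaling_governor")),
        "cur_freq_khz": read_text(os.path.join(cpufreq_dir, "scaling_cur_freq")),
    }
```

The blacklist itself is normally a single line, `blacklist cpufreq_ondemand`, in a file under /etc/modprobe.d/. After rebooting, `cpufreq_status()["ondemand_loaded"]` should then come back `False`.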

_________________
Matt Sealey

