1

server crash

Тема: server crash

Hello
i have a custom plan for my devices
about 25 device send their data to my server and i will save them to my database and then send a copy of data to gurtam with a python script

the problem is
some times without any warning the server crash and goes down!

i searched the web and some forum guys said this due to max open file of Os but i increased max open file from 1000 to 25000 and nothing changed!!!

server is check for hardware error
server data
dual xeon 5620
16 gig ram
4* ssd hard
port 100 mbps

os : ubuntu 14.10 64bits

is there any idea to resolve this?!

2

server crash

(22/08/2016 11:47:49 отредактировано gaev)

Re: server crash

What is the output for

cat /proc/service_PID/limits
Евгений
WDC Administrator
Gurtam
3

server crash

Re: server crash

out put:

root@gurtam-test:~# cat /proc/service_PID/limits
cat: /proc/service_PID/limits: No such file or directory
root@gurtam-test:~# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62797
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 62797
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
4

server crash

(22/08/2016 14:27:42 отредактировано gaev)

Re: server crash

open files                      (-n) 1024

not 25000

service_PID  - the PID of your  program

Евгений
WDC Administrator
Gurtam
5

server crash

Re: server crash

gaev пишет:
open files                      (-n) 1024

not 25000

service_PID  - the PID of your  program

this is process limit

root@gurtam-test:~# cat /proc/23906/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             62797                62797                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       62797                62797                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

how can i check the current usage of my service!??!

6

server crash

(23/08/2016 09:17:30 отредактировано gaev)

Re: server crash

roozgar пишет:

how can i check the current usage of my service!??!

ls -l /proc/pid/fd | wc -l
cat /proc/pid/status 
Евгений
WDC Administrator
Gurtam
7

server crash

(23/08/2016 12:20:35 отредактировано roozgar)

Re: server crash

Ok thank you for your usefull answers,
i have an other critical problem too


this is full of my python code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import socket
import os
import threading
import mysql.connector
import time
import datetime
import requests
import sys
#
Dconn = mysql.connector.connect(user='root',password='aaaaa',host='aaaaa',database='aaaa',charset='utf8',autocommit=True)
cursor = Dconn.cursor(buffered=True)
cursor2 = Dconn.cursor(buffered=True)
print('databased stablished.')
def sendData(protocol,server,port,sig):
    print((str(datetime.datetime.now()))+' -> ( '+format(protocol)+' ) '+format(server)+':'+format(port)+' - dbId : '+format(sig[0])+' tId : '+threading.current_thread().name)
    cursor2.execute("select * from raw join devices on devices.dvid=raw.device where raw.rdid='"+format(sig[1])+"' limit 1;")
    sendSig = cursor2.fetchone()
#    print(sendSig)
    timedata = datetime.datetime.fromtimestamp(int(sendSig[12]))
    devimei = sendSig[18]
    devdate = timedata.strftime("%d%m%y")
    devtime = timedata.strftime("%H%M%S")
    lat= format(sendSig[2])
    lon= format(sendSig[3])
    satcount = format(sendSig[5])
    speed = format(sendSig[4])
    batery = format(sendSig[8])
    band = format(sendSig[10])
    if protocol=='tcp':
        try:
            sistr='$MGV002,'+devimei+',12345,S,'+devdate+','+devtime+',A,'+lat+',N,'+lon+',E,0,'+satcount+',00,1.95,'+speed+',0,,,432,35,0,0,5,,,,,,,,,0'+band+','+batery+',Timer;!'
            clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            result = clientsocket.connect((server,port))
#            clientsocket.settimeout(4)
            clientsocket.send(sistr)
            data = clientsocket.recv(1024)
            print(str(datetime.datetime.now())+' -> send completed :'+format(sig[0]))
            clientsocket.close()
        except:
            print(str(datetime.datetime.now())+' -> connection to tcp server failed!!')
    elif protocol=='http':
        try:
            sistr='***'+devimei+';123;'+devdate+';'+devtime+';'+lat+';N;'+lon+';E;'+speed+';NA;159;'+satcount+';NA;5;'+batery+';[inputs];[outputs];[adc];[ibutton];[params];***'
            #r = requests.post(server, data={'action': 'update','dev': 'log','ver': '1','val': sistr})
            #r = requests.post(server+'&data='+sistr, data={})
            r = requests.post('http://psimap.ir/WBSRV.php?action=update&dev=log&ver=1&data='+sistr, data={})
            print(str(datetime.datetime.now())+ ' -> Send completed :'+format(sig[0])+' | Response:'+format(r.text))
        except:
            print(str(datetime.datetime.now())+' -> connection to http server failed!!')
    try:
        sqlobject = "UPDATE `row_sent` SET `send_date`=CURRENT_TIMESTAMP WHERE `rsid`='"+format(sig[0])+"' limit 1;"
        cursor2.execute(sqlobject)
    except:
        print(str(datetime.datetime.now())+' -> database rid!!')
        sys.exit(1)
#main
while True:
    cursor.execute("select * from row_sent join servers on servers.sid = row_sent.server_id where result='pending' order by rsid DESC limit 1;")
    if cursor.rowcount>0 :
        waitedSig = cursor.fetchone()
        sqlobject2 = "UPDATE `row_sent` SET `result`='succes' WHERE `rsid`='"+ format(waitedSig[0]) +"' limit 1;"
        cursor.execute(sqlobject2)
        time.sleep(0.2)
        t = threading.Thread(target=sendData , args=((waitedSig[8]),(waitedSig[9]),(waitedSig[10]),(waitedSig),))
        t.start()
        time.sleep(0.5)
    else:
        time.sleep(1)

the script working good at all
read records from database and send data to gurtam server
but sometimes it crashes with out any error and until i run it again manually it dont send anything to gurtam!!

is there in problem in gurtam side?!
is there any solution or useful trick to avoid script break!!?