Speeding up a Python timestamp field calculation in ArcGIS Desktop?

9

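I am new to Python and have started creating scripts for ArcGIS workflows. I am wondering how I can speed up my code for generating an "Hours" double numeric field from a timestamp field. I start with a track point log (breadcrumb trail) shapefile generated by DNR Garmin, with an LTIME timestamp field (a text field, length 20) recording when each track point was taken. The script calculates the difference in hours between each successive timestamp ("LTIME") and puts it into a new field ("Hours").

That way I can go back and summarize how much time I spent in a particular area/polygon. The main part is after the print "Executing getnextLTIME.py script..." statement. Here is the code: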

# ---------------------------------------------------------------------------
# 
# Created on: Sept 9, 2010
# Created by: The Nature Conservancy
# Calculates delta time (hours) between successive rows based on timestamp field
#
# Credit should go to Richard Crissup, ESRI DTC, Washington DC for his
# 6-27-2008 date_diff.py posted as an ArcScript
'''
    This script assumes the format "month/day/year hours:minutes:seconds".
    The hour needs to be in military time. 
    If you are using another format please alter the script accordingly. 
    I do a little checking to see if the input string is in the format
    "month/day/year hours:minutes:seconds" as this is a common date time
    format. Also the hours:minute:seconds is included, otherwise we could 
    be off by almost a day.

    I am not sure if the time functions do any conversion to GMT, 
    so if the times passed in are in another time zone than the computer
    running the script, you will need to pad the time given back in 
    seconds by the difference in time from where the computer is in relation
    to where they were collected.

'''
# ---------------------------------------------------------------------------
#       FUNCTIONS
#----------------------------------------------------------------------------        
import arcgisscripting, sys, os, re
import time, calendar, string, decimal
def func_check_format(time_string):
    if time_string.find("/") == -1:
        print "Error: time string doesn't contain any '/' expected format \
            is month/day/year hour:minutes:seconds"
    elif time_string.find(":") == -1:
        print "Error: time string doesn't contain any ':' expected format \
            is month/day/year hour:minutes:seconds"

    else:
        # Both separators are present, so check for a date part and a time part
        parts = time_string.split()
        if len(parts) != 2:
            print "Error: time string doesn't contain a date and time separated \
                by a space. Expected format is 'month/day/year hour:minutes:seconds'"


def func_parse_time(time_string):
    '''
    take the time value and make it into a tuple with 9 values
    example = "2004/03/01 23:50:00". If the date values don't look like this
    then the script will fail. 
    '''
    year=0;month=0;day=0;hour=0;minute=0;sec=0;
    time_string = str(time_string)
    l=time_string.split()
    if not len(l) == 2:
        gp.AddError("Error: func_parse_time, expected 2 items in list l got " + \
            str(len(l)) + ", time field value = " + time_string)
        raise Exception 
    cal=l[0];cal=cal.split("/")
    if not len(cal) == 3:
        gp.AddError("Error: func_parse_time, expected 3 items in list cal got " + \
            str(len(cal)) + ", time field value = " + time_string)
        raise Exception
    ti=l[1];ti=ti.split(":")
    if not len(ti) == 3:
        gp.AddError("Error: func_parse_time, expected 3 items in list ti got " + \
            str(len(ti)) + ", time field value = " + time_string)
        raise Exception
    if int(len(cal[0]))== 4:
        year=int(cal[0])
        month=int(cal[1])
        day=int(cal[2])
    else:
        year=int(cal[2])
        month=int(cal[0])
        day=int(cal[1])       
    hour=int(ti[0])
    minute=int(ti[1])
    sec=int(ti[2])
    # formated tuple to match input for time functions
    result=(year,month,day,hour,minute,sec,0,0,0)
    return result


#----------------------------------------------------------------------------

def func_time_diff(start_t,end_t):
    '''
    Take the two numbers that represent seconds
    since Jan 1 1970 and return the difference of
    those two numbers in hours. There are 3600 seconds
    in an hour. 60 secs * 60 min   '''

    start_secs = calendar.timegm(start_t)
    end_secs = calendar.timegm(end_t)

    x=abs(end_secs - start_secs)
    #diff = number hours difference
    #as ((x/60)/60)
    diff = float(x)/float(3600)   
    return diff

#----------------------------------------------------------------------------

print "Executing getnextLTIME.py script..."

try:
    gp = arcgisscripting.create(9.3)

    # set parameter to what user drags in
    fcdrag = gp.GetParameterAsText(0)
    psplit = os.path.split(fcdrag)

    folder = str(psplit[0]) #containing folder
    fc = str(psplit[1]) #feature class
    fullpath = str(fcdrag)

    gp.Workspace = folder

    fldA = gp.GetParameterAsText(1) # Timestamp field
    fldDiff = gp.GetParameterAsText(2) # Hours field

    # set the toolbox for adding the field to data managment
    gp.Toolbox = "management"
    # add the user named hours field to the feature class
    gp.addfield (fc,fldDiff,"double")
    #gp.addindex(fc,fldA,"indA","NON_UNIQUE", "ASCENDING")

    desc = gp.describe(fullpath)
    updateCursor = gp.UpdateCursor(fullpath, "", desc.SpatialReference, \
        fldA+"; "+ fldDiff, fldA)
    row = updateCursor.Next()
    count = 0
    oldtime = str(row.GetValue(fldA))
    #check datetime to see if parseable
    func_check_format(oldtime)
    gp.addmessage("Calculating " + fldDiff + " field...")

    while row <> None:
        if count == 0:
            row.SetValue(fldDiff, 0)
        else:
            start_t = func_parse_time(oldtime)
            b = str(row.GetValue(fldA))
            end_t = func_parse_time(b)
            diff_hrs = func_time_diff(start_t, end_t)
            row.SetValue(fldDiff, diff_hrs)
            oldtime = b

        count += 1
        updateCursor.UpdateRow(row)
        row = updateCursor.Next()

    # count already equals the number of rows processed by the loop
    gp.addmessage("Updated " + str(count) + " rows.")
    #gp.removeindex(fc,"indA")
    del updateCursor
    del row

except Exception, ErrDesc:
    import traceback;traceback.print_exc()

print "Script complete."

arcpy time

— Russell

1

Nice program! I haven't seen anything that speeds up the calculation. Field Calculator takes forever!

— Brad Nesom

12

Cursors are always slow in the geoprocessing environment. The easiest way around this is to pass a Python code block to the CalculateField geoprocessing tool.

Something like this should work:

import arcgisscripting
gp = arcgisscripting.create(9.3)

# Create a code block to be executed for each row in the table
# The code block is necessary for anything over a one-liner.
codeblock = """
import datetime
class CalcDiff(object):
    # Class attributes are static, that is, only one exists for all 
    # instances, kind of like a global variable for classes.
    Last = None
    def calcDiff(self,timestring):
        # parse the time string according to our format.
        t = datetime.datetime.strptime(timestring, '%m/%d/%Y %H:%M:%S')
        # return the difference from the last date/time
        if CalcDiff.Last:
            diff =  t - CalcDiff.Last
        else:
            diff = datetime.timedelta()
        CalcDiff.Last = t
        # Include whole days; diff.seconds alone wraps around every 24 hours
        return (diff.days * 86400 + diff.seconds) / 3600.0
"""

expression = """CalcDiff().calcDiff(!timelabel!)"""

gp.CalculateField_management(r'c:\workspace\test.gdb\test','timediff',expression,   "PYTHON", codeblock)

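Obviously you would have to modify it to take the feature class and field names as parameters, but it should be pretty fast.

Note that although your own date/time parsing functions are actually a hair faster than the strptime() function, the standard library is almost always more bug-free.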
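To try the date arithmetic outside ArcGIS, here is a minimal standalone sketch of the same delta calculation using only the standard library. It assumes the timestamps follow the month/day/year hours:minutes:seconds layout used in the code block above; the function name hours_between and the sample values are made up for illustration.

```python
from datetime import datetime

# Assumed timestamp layout, the same format string as in the code block above.
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def hours_between(timestamps, fmt=TIME_FORMAT):
    """Return hours elapsed since the previous timestamp; the first entry is 0.0."""
    deltas = []
    previous = None
    for text in timestamps:
        current = datetime.strptime(text, fmt)
        if previous is None:
            deltas.append(0.0)
        else:
            diff = current - previous
            # Count whole days too; timedelta.seconds alone wraps at 24 hours.
            deltas.append((diff.days * 86400 + diff.seconds) / 3600.0)
        previous = current
    return deltas


print(hours_between([
    "09/09/2010 23:30:00",
    "09/10/2010 00:00:00",
    "09/11/2010 06:00:00",
]))
# prints [0.0, 0.5, 30.0]
```

The last gap spans more than a day, which is why the day count is added to the seconds before converting to hours.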

— David

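Thanks David. I did not realize CalculateField was faster; I will try to test this. The only problem I can think of is that the dataset may be out of order, which happens on occasion. Is there a way to sort ascending on the LTIME field first and then apply CalculateField, or to tell CalculateField to execute in a certain order?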

— Russell
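
A caveat on the ordering question: because LTIME is a text field, a plain ascending text sort of month/day/year strings is not chronological across years. The following is a minimal sketch in plain Python, with no ArcGIS calls, of ordering records by the parsed timestamp instead. The record ids, the sample values and the name sorted_by_time are hypothetical, and the sketch does not show how to make CalculateField itself run in a given order.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumed layout of the LTIME text values


def sorted_by_time(records, fmt=TIME_FORMAT):
    """Return (record_id, timestamp_text) pairs in chronological order."""
    return sorted(records, key=lambda rec: datetime.strptime(rec[1], fmt))


# Made-up sample records: (id, LTIME text)
records = [
    (1, "12/31/2009 23:00:00"),
    (2, "01/01/2010 01:00:00"),
    (3, "02/15/2009 08:00:00"),
]

print([rec[0] for rec in sorted_by_time(records)])
# prints [3, 1, 2]; sorting the raw text instead would give [2, 3, 1]
```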

Calling the pre-built gp functions will be faster most of the time. I explained why in a previous post: gis.stackexchange.com/questions/8186/...

— Ragi Yaser Burhum

+1 for using the built-in datetime package, as it offers great functionality and almost replaces the time/calendar packages.

— Mike T

1

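That was incredible! I tried your code and integrated it with @OptimizePrime's "in memory" suggestion, and the script's average run time went from 55 seconds to 2 seconds (810 records). This is exactly the kind of thing I was looking for. Thank you so much. I learned a lot.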

— Russell

3

@David has given a very clean solution. +1 for using the strengths of the arcgisscripting codebase.

Another option is to copy the dataset into memory using:

gp.CopyFeatureclass("path to your source", "in_memory\copied feature name") - for a geodatabase feature class or shapefile, or
gp.CopyRows("path to your source", ) - for a geodatabase table, dbf, etc.

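This removes the overhead incurred when you request a cursor from the ESRI COM codebase.

The overhead comes from converting Python data types to C data types and from accessing the ESRI COM codebase.

When your data is in memory, you reduce the need to access the disk (a high-cost process). You also reduce the need for the Python and C/C++ libraries to transfer data when you use arcgisscripting.

Hope this helps.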

— OptimizePrime

1

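A great alternative to the old-style UpdateCursor from arcgisscripting, available since ArcGIS 10.1 for Desktop, is arcpy.da.UpdateCursor.

I have found that these are typically about 10 times faster.

It may not have been an option when this question was written, but it should not be overlooked by anyone reading this Q&A now.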
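
As a rough sketch of what that could look like for this question, the function below fills the hours field from any cursor that behaves like arcpy.da.UpdateCursor (iterating yields mutable rows, and updateRow(row) writes one back). The function name fill_hours, the timestamp format and the path and field names in the commented usage are assumptions for illustration, and rows are processed in whatever order the cursor returns them.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumed layout of the timestamp text


def fill_hours(cursor, fmt=TIME_FORMAT):
    """Write hours since the previous row into the second field of each row.

    `cursor` is expected to behave like arcpy.da.UpdateCursor opened on
    [timestamp_field, hours_field]. Returns the number of rows updated.
    """
    previous = None
    count = 0
    for row in cursor:
        current = datetime.strptime(row[0], fmt)
        if previous is None:
            row[1] = 0.0
        else:
            diff = current - previous
            # Include whole days so gaps longer than 24 hours are not truncated.
            row[1] = (diff.days * 86400 + diff.seconds) / 3600.0
        cursor.updateRow(row)
        previous = current
        count += 1
    return count


# Hypothetical usage inside ArcGIS 10.1 or later (placeholder path and fields):
# import arcpy
# with arcpy.da.UpdateCursor(r"C:\data\tracks.shp", ["LTIME", "Hours"]) as cursor:
#     fill_hours(cursor)
```

Keeping the arithmetic in a function that only needs a cursor-like object makes it possible to check the logic with a small stand-in object before running it against real data.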

— PolyGeo