2014. 3. 18. 17:38, Tool Information and Usage/Python
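LCS here means the longest common substring: the longest contiguous run of characters that appears in both strings. It is not the longest common subsequence, which does not require the matching characters to be adjacent.
*Longest Common Substring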
def longest_common_substring(s1, s2):
    # m[x][y] is the length of the common substring ending at s1[x - 1] and s2[y - 1].
    m = [[0] * (1 + len(s2)) for i in range(1 + len(s1))]
    # longest: best length so far; x_longest: index in s1 just past where it ends.
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(s1)):
        for y in range(1, 1 + len(s2)):
            if s1[x - 1] == s2[y - 1]:
                m[x][y] = m[x - 1][y - 1] + 1
                if m[x][y] > longest:
                    longest = m[x][y]
                    x_longest = x
            else:
                m[x][y] = 0
    return s1[x_longest - longest: x_longest]
def main():
    a = "12345123jdfjdfjdfjdf"
    b = "s902384jdf9123 jdfj"
    print(longest_common_substring(a, b))


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("User aborted.")
    except SystemExit:
        pass
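Because the abbreviation LCS is also commonly used for the longest common subsequence, a minimal sketch of that variant may help for comparison. The function name `longest_common_subsequence` is an assumption for illustration and is not part of the original post. The table is filled the same way, except that a mismatch carries the best length forward instead of resetting it to 0.

```python
def longest_common_subsequence(s1, s2):
    """Return one longest common subsequence (characters need not be adjacent)."""
    # m[x][y] holds the subsequence length for the prefixes s1[:x] and s2[:y].
    m = [[0] * (1 + len(s2)) for _ in range(1 + len(s1))]
    for x in range(1, 1 + len(s1)):
        for y in range(1, 1 + len(s2)):
            if s1[x - 1] == s2[y - 1]:
                m[x][y] = m[x - 1][y - 1] + 1
            else:
                # Unlike the substring version, keep the best length seen so far.
                m[x][y] = max(m[x - 1][y], m[x][y - 1])
    # Walk back from the bottom-right corner to recover the characters.
    result = []
    x, y = len(s1), len(s2)
    while x > 0 and y > 0:
        if s1[x - 1] == s2[y - 1]:
            result.append(s1[x - 1])
            x -= 1
            y -= 1
        elif m[x - 1][y] >= m[x][y - 1]:
            x -= 1
        else:
            y -= 1
    return "".join(reversed(result))
```

When several subsequences share the maximum length, this sketch returns just one of them.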
LRS finds the longest substring that is repeated in a string, here at least a given number of times.
*Longest Repeated Substring
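from collections import Counter

a = 'aaabbbsdfsdfasdfbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbaaaa34243sfsaef'
times = 3
# Try the longest candidate length first and stop at the first length that has
# a substring occurring at least `times` times (overlapping occurrences count).
for n in range(1, len(a) // times + 1)[::-1]:
    substrings = [a[i:i + n] for i in range(len(a) - n + 1)]
    freqs = Counter(substrings)
    seq, count = freqs.most_common(1)[0]
    if count >= times:
        print("sequence '%s' of length %s occurs %s or more times" % (seq, n, times))
        break
else:
    print("no substring occurs %s or more times" % times)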
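The same search can be wrapped in a reusable function. The sketch below is an assumption-laden variant, not the original script: the name `longest_repeated_substring` and its `times` default are hypothetical. It tries every length instead of stopping at len(a)/times, because `Counter` counts overlapping occurrences, so a repeated substring can be longer than that bound.

```python
from collections import Counter


def longest_repeated_substring(text, times=2):
    """Return the longest substring occurring at least `times` times, or None.

    Occurrences may overlap, so every length from len(text) down to 1 is tried.
    """
    for n in range(len(text), 0, -1):
        # Count every window of length n, including overlapping ones.
        counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        substring, count = counts.most_common(1)[0]
        if count >= times:
            return substring
    return None
```

For example, in "aaaa" with `times=3` the substring "aa" occurs three times when overlaps are counted, whereas a bound of len(a)/times would only test length 1.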