Analisis Frekuensi (Frequency Analysis) Analisis I...
Created: 29 May 2026
Answered using GPT-5.5 Thinking by Chat01
Frequency Analysis
Interval Analysis (Gap Analysis)
Digit Pattern Analysis
Time Series Analysis
Probability Approach (Probability Theory)
```python
import json, math, os, re, statistics, textwrap

import numpy as np
import pandas as pd

path = "/mnt/data/data_history.txt"
# Read every column as a string so leading zeros in the 4D numbers survive
df = pd.read_csv(path, dtype=str)
df.head(), df.shape, df.columns.tolist()[:5], df.tail()
```

```
(            Tanggal  Waktu 1st Prize 2nd Prize  ... Cons2 Cons3 Cons4 Cons5
 0  10 Februari 2026  21:30      6112      0236  ...  2500  6393  7533  9150
 1  11 Februari 2026  03:30      9876      2992  ...  7518  7253  9528  2514
 2  11 Februari 2026  06:30      2279      9443  ...  8064  8957  3583  9130
 3  11 Februari 2026  11:30      7453      8459  ...  3498  9558  7317  0120
 4  11 Februari 2026  13:30      7814      6407  ...  9497  5878  4385  1936

 [5 rows x 15 columns],
 (639, 15),
 ['Tanggal', 'Waktu', '1st Prize', '2nd Prize', '3rd Prize'],
          Tanggal  Waktu 1st Prize 2nd Prize  ... Cons2 Cons3 Cons4 Cons5
 634  28 Mei 2026  13:30      9454      2091  ...  3397  3112  5501  4833
 635  28 Mei 2026  17:30      5132      6039  ...  7173  6501  7531  9170
 636  28 Mei 2026  21:30      9276      6327  ...  6856  6071  9093  9312
 637  29 Mei 2026  03:30      9506      5912  ...  4509  3969  0186  2087
 638  29 Mei 2026  06:30      1808      8976  ...  8135  5670  9684  4057

 [5 rows x 15 columns])
```
```python
from datetime import datetime

# Indonesian month names used in the 'Tanggal' column
month_map = {'Januari': 1, 'Februari': 2, 'Maret': 3, 'April': 4, 'Mei': 5, 'Juni': 6,
             'Juli': 7, 'Agustus': 8, 'September': 9, 'Oktober': 10, 'November': 11, 'Desember': 12}

def parse_ind_date(s):
    # 'Tanggal' looks like "10 Februari 2026": day, month name, year
    parts = s.split()
    return datetime(int(parts[2]), month_map[parts[1]], int(parts[0]))

df['date'] = df['Tanggal'].apply(parse_ind_date)
df['dt'] = pd.to_datetime(df['date'].dt.strftime('%Y-%m-%d') + ' ' + df['Waktu'])
num_cols = [c for c in df.columns if c not in ['Tanggal', 'Waktu', 'date', 'dt']]
num_cols
```

```
['1st Prize', '2nd Prize', '3rd Prize', 'Lucky1', 'Lucky2', 'Lucky3', 'Lucky4', 'Lucky5', 'Cons1', 'Cons2', 'Cons3', 'Cons4', 'Cons5']
```
```python
# Create long data of all numeric outputs
long = df.melt(id_vars=['Tanggal', 'Waktu', 'date', 'dt'], value_vars=num_cols,
               var_name='category', value_name='num')
long['num'] = long['num'].str.zfill(4)

# melt groups by column, not by row, so sort by datetime and then by category
# order so that each draw's 13 numbers are contiguous and in column order.
cat_order = {c: i for i, c in enumerate(num_cols)}
long['cat_ord'] = long['category'].map(cat_order)
long = long.sort_values(['dt', 'cat_ord']).reset_index(drop=True)
long['seq_idx'] = np.arange(len(long))
# Draw index: only meaningful after the sort above (13 consecutive slots per draw)
long['row_idx'] = long['seq_idx'] // len(num_cols)

long['digits'] = long['num'].apply(list)
for i in range(4):
    long[f'd{i+1}'] = long['num'].str[i]
long['first2'] = long['num'].str[:2]
long['last2'] = long['num'].str[2:]
long['sum_digits'] = long['num'].apply(lambda x: sum(map(int, x)))
long['unique_count'] = long['num'].apply(lambda x: len(set(x)))
long['has_repeat'] = long['unique_count'] < 4
long['all_same'] = long['unique_count'] == 1
long['palindrome'] = long['num'].apply(lambda x: x == x[::-1])
long['ascending_seq'] = long['num'].apply(lambda x: all((int(x[i+1]) - int(x[i]) == 1) for i in range(3)))
long['descending_seq'] = long['num'].apply(lambda x: all((int(x[i]) - int(x[i+1]) == 1) for i in range(3)))
long['even_odd_pattern'] = long['num'].apply(lambda x: ''.join('E' if int(d) % 2 == 0 else 'O' for d in x))
long['high_low_pattern'] = long['num'].apply(lambda x: ''.join('H' if int(d) >= 5 else 'L' for d in x))
len(long), long.head(), long.tail()
```

```
(8307,
             Tanggal  Waktu  ... even_odd_pattern high_low_pattern
 0  10 Februari 2026  21:30  ...             EOOE             HLLL
 1  10 Februari 2026  21:30  ...             EEOE             LLLH
 2  10 Februari 2026  21:30  ...             OOEO             HLLL
 3  10 Februari 2026  21:30  ...             OEOE             HLHH
 4  10 Februari 2026  21:30  ...             OOEE             LLLH

 [5 rows x 25 columns],
           Tanggal  Waktu  ... even_odd_pattern high_low_pattern
 8302  29 Mei 2026  06:30  ...             EOEE             HHLL
 8303  29 Mei 2026  06:30  ...             EOOO             HLLH
 8304  29 Mei 2026  06:30  ...             OEOE             HHHL
 8305  29 Mei 2026  06:30  ...             OEEE             HHHL
 8306  29 Mei 2026  06:30  ...             EEOO             LLHH

 [5 rows x 25 columns])
```
```python
# Basic counts
basic = {
    'rows': len(df),
    'entries': len(long),
    'start': df['dt'].min(),
    'end': df['dt'].max(),
    'date_count': df['date'].nunique(),
    'time_counts': df['Waktu'].value_counts().sort_index().to_dict(),
}
basic
```

```
{'rows': 639,
 'entries': 8307,
 'start': Timestamp('2026-02-10 21:30:00'),
 'end': Timestamp('2026-05-29 06:30:00'),
 'date_count': 109,
 'time_counts': {'03:30': 108, '06:30': 108, '11:30': 107, '13:30': 101, '17:30': 107, '21:30': 108}}
```
```python
# Frequency of exact numbers across all prize columns
exact_counts = long['num'].value_counts()
top_exact = exact_counts.head(15)
top_exact_df = top_exact.rename_axis('num').reset_index(name='count')
top_exact_df['pct'] = top_exact_df['count'] / len(long) * 100
top_exact_df.head(15)
```

```
     num  count       pct
0   3462      6  0.072228
1   7341      6  0.072228
2   7359      5  0.060190
3   3744      5  0.060190
4   2895      5  0.060190
5   8294      5  0.060190
6   4796      5  0.060190
7   8218      5  0.060190
8   9178      5  0.060190
9   7048      5  0.060190
10  4517      5  0.060190
11  1570      5  0.060190
12  3345      5  0.060190
13  6284      5  0.060190
14  1471      4  0.048152
```
```python
# Frequency of exact numbers in the 1st Prize column only
prize1_counts = df['1st Prize'].str.zfill(4).value_counts()
top_prize1 = prize1_counts.head(15).rename_axis('num').reset_index(name='count')
top_prize1['pct'] = top_prize1['count'] / len(df) * 100
top_prize1.head(15)
```

```
     num  count       pct
0   7059      2  0.312989
1   2193      2  0.312989
2   5194      2  0.312989
3   1294      2  0.312989
4   1236      2  0.312989
5   1797      2  0.312989
6   2619      2  0.312989
7   7232      2  0.312989
8   3145      2  0.312989
9   6691      2  0.312989
10  9282      2  0.312989
11  3191      2  0.312989
12  2231      2  0.312989
13  9676      2  0.312989
14  9265      2  0.312989
```
```python
# Digit frequencies overall (all four positions pooled)
digits_series = pd.Series([d for s in long['num'] for d in s])
digit_counts = digits_series.value_counts().sort_index()
digit_freq_df = digit_counts.rename_axis('digit').reset_index(name='count')
digit_freq_df['pct'] = digit_freq_df['count'] / len(digits_series) * 100
digit_freq_df
```

```
  digit  count        pct
0     0   3273   9.850126
1     1   3420  10.292524
2     2   3347  10.072830
3     3   3283   9.880221
4     4   3356  10.099916
5     5   3284   9.883231
6     6   3317   9.982545
7     7   3376  10.160106
8     8   3280   9.871193
9     9   3292   9.907307
```
```python
# Digit frequencies per position
pos_freq = {}
for i in range(4):
    counts = long[f'd{i+1}'].value_counts().sort_index()
    pos_freq[f'pos{i+1}'] = counts
pos_freq_df = pd.DataFrame(pos_freq).fillna(0).astype(int)
pos_pct_df = pos_freq_df.apply(lambda col: col / col.sum() * 100, axis=0)
pos_freq_df, pos_pct_df.round(2)
```

```
(   pos1  pos2  pos3  pos4
 0   828   819   814   812
 1   840   835   894   851
 2   845   862   823   817
 3   861   838   787   797
 4   861   794   828   873
 5   803   867   805   809
 6   830   855   785   847
 7   833   820   883   840
 8   799   807   823   851
 9   807   810   865   810,
     pos1   pos2   pos3   pos4
 0   9.97   9.86   9.80   9.77
 1  10.11  10.05  10.76  10.24
 2  10.17  10.38   9.91   9.84
 3  10.36  10.09   9.47   9.59
 4  10.36   9.56   9.97  10.51
 5   9.67  10.44   9.69   9.74
 6   9.99  10.29   9.45  10.20
 7  10.03   9.87  10.63  10.11
 8   9.62   9.71   9.91  10.24
 9   9.71   9.75  10.41   9.75)
```
```python
# Top first-two-digit and last-two-digit combinations
first2_counts = long['first2'].value_counts()
last2_counts = long['last2'].value_counts()
top_first2 = first2_counts.head(10).rename_axis('first2').reset_index(name='count')
top_last2 = last2_counts.head(10).rename_axis('last2').reset_index(name='count')
top_first2['pct'] = top_first2['count'] / len(long) * 100
top_last2['pct'] = top_last2['count'] / len(long) * 100
top_first2, top_last2
```

```
(  first2  count       pct
 0     70    105  1.263994
 1     68    102  1.227880
 2     82    100  1.203804
 3     75    100  1.203804
 4     48     99  1.191766
 5     45     99  1.191766
 6     76     98  1.179728
 7     03     96  1.155652
 8     42     95  1.143614
 9     17     95  1.143614,
   last2  count       pct
 0    76    121  1.456603
 1    94    113  1.360299
 2    48    100  1.203804
 3    19     99  1.191766
 4    13     97  1.167690
 5    71     97  1.167690
 6    70     96  1.155652
 7    93     96  1.155652
 8    28     95  1.143614
 9    84     95  1.143614)
```
```python
# Pattern analysis
pattern_summary = {
    'has_repeat_count': int(long['has_repeat'].sum()),
    'has_repeat_pct': long['has_repeat'].mean() * 100,
    'all_unique_count': int((~long['has_repeat']).sum()),
    'all_unique_pct': (~long['has_repeat']).mean() * 100,
    'all_same_count': int(long['all_same'].sum()),
    'all_same_pct': long['all_same'].mean() * 100,
    'palindrome_count': int(long['palindrome'].sum()),
    'palindrome_pct': long['palindrome'].mean() * 100,
    'ascending_seq_count': int(long['ascending_seq'].sum()),
    'descending_seq_count': int(long['descending_seq'].sum()),
}
pattern_summary
```

```
{'has_repeat_count': 3788,
 'has_repeat_pct': np.float64(45.60009630432166),
 'all_unique_count': 4519,
 'all_unique_pct': np.float64(54.39990369567834),
 'all_same_count': 8,
 'all_same_pct': np.float64(0.09630432165643434),
 'palindrome_count': 70,
 'palindrome_pct': np.float64(0.8426628144938004),
 'ascending_seq_count': 7,
 'descending_seq_count': 6}
```
```python
# Distribution of the number of distinct digits (1..4)
uniq_dist = long['unique_count'].value_counts().sort_index()
uniq_dist_df = uniq_dist.rename_axis('unique_digits').reset_index(name='count')
uniq_dist_df['pct'] = uniq_dist_df['count'] / len(long) * 100

# Most common even/odd and high/low patterns
eo_top = long['even_odd_pattern'].value_counts().head(10).rename_axis('pattern').reset_index(name='count')
eo_top['pct'] = eo_top['count'] / len(long) * 100
hl_top = long['high_low_pattern'].value_counts().head(10).rename_axis('pattern').reset_index(name='count')
hl_top['pct'] = hl_top['count'] / len(long) * 100

# Digit-sum distribution summary and most common sums
sum_stats = long['sum_digits'].describe()
top_sums = (long['sum_digits'].value_counts().sort_values(ascending=False)
            .head(10).rename_axis('sum_digits').reset_index(name='count'))
top_sums['pct'] = top_sums['count'] / len(long) * 100
uniq_dist_df, eo_top, hl_top, sum_stats, top_sums
```

```
(   unique_digits  count        pct
 0              1      8   0.096304
 1              2    422   5.080053
 2              3   3358  40.423739
 3              4   4519  54.399904,
   pattern  count       pct
 0    EEOO    554  6.669074
 1    EEOE    551  6.632960
 2    OOOE    549  6.608884
 3    OOEE    539  6.488504
 4    EOEE    538  6.476466
 5    OEEE    534  6.428313
 6    EOOO    534  6.428313
 7    OEOE    523  6.295895
 8    EEEO    516  6.211629
 9    OEOO    514  6.187553,
   pattern  count       pct
 0    LLHL    577  6.945949
 1    LHLH    561  6.753341
 2    LLLH    550  6.620922
 3    LHHL    545  6.560732
 4    HHLL    542  6.524618
 5    HLHL    541  6.512580
 6    LHLL    522  6.283857
 7    LLHH    519  6.247743
 8    HHLH    517  6.223667
 9    HLHH    507  6.103286,
 count    8307.000000
 mean       17.961839
 std         5.623393
 min         0.000000
 25%        14.000000
 50%        18.000000
 75%        22.000000
 max        36.000000
 Name: sum_digits, dtype: float64,
    sum_digits  count       pct
 0          17    575  6.921873
 1          19    574  6.909835
 2          18    563  6.777417
 3          16    529  6.368123
 4          15    529  6.368123
 5          20    508  6.115324
 6          21    492  5.922716
 7          22    449  5.405080
 8          14    423  5.092091
 9          13    411  4.947635)
```
```python
# Gap analysis functions
def gap_stats_for_series(series, idx=None):
    # series is aligned with seq_idx implicitly
    if idx is None:
        idx = np.arange(len(series))
    records = []
    for val, positions in pd.Series(idx).groupby(series).agg(list).items():
        pos = np.array(positions, dtype=int)
        gaps = np.diff(pos)
        records.append({
            'value': val,
            'count': len(pos),
            'first_idx': int(pos.min()),
            'last_idx': int(pos.max()),
            'current_gap': int(idx[-1] - pos.max()),
            # A value seen twice has exactly one gap, which is still a valid gap,
            # so only values seen once get NaN here.
            'avg_gap': float(np.mean(gaps)) if len(gaps) > 0 else np.nan,
            'max_gap': int(np.max(gaps)) if len(gaps) > 0 else np.nan,
        })
    return pd.DataFrame(records)

exact_gap = gap_stats_for_series(long['num'].values, long['seq_idx'].values)
# Keep only numbers that appeared at least 3 times
overdue_exact = exact_gap[exact_gap['count'] >= 3].sort_values(['current_gap', 'count'], ascending=[False, False]).head(15)
recent_exact = exact_gap[exact_gap['count'] >= 3].sort_values(['current_gap', 'count'], ascending=[True, False]).head(15)
overdue_exact, recent_exact
```

```
(     value  count  first_idx  last_idx  current_gap  avg_gap  max_gap
 1420  2500      3          9      1031         7275    511.0    646.0
 3852  6814      3       1408      1918         6388    255.0    461.0
 24    0042      3        207      1975         6331    884.0   1577.0
 680   1206      3        784      2039         6267    627.5    699.0
 4596  8146      3       1808      2129         6177    160.5    220.0
 4330  7636      3        713      2130         6176    708.5   1232.0
 2889  5108      3       1416      2151         6155    367.5    435.0
 529   0946      3        351      2307         5999    978.0   1867.0
 2699  4761      3        321      2351         5955   1015.0   1617.0
 3822  6757      3        227      2587         5719   1180.0   1814.0
 3731  6589      3        256      2644         5662   1194.0   1565.0
 5385  9528      3         24      2658         5648   1317.0   1666.0
 5154  9138      3       1486      2667         5639    590.5    813.0
 1260  2208      3       1765      2695         5611    465.0    473.0
 2492  4379      3        329      2804         5502   1237.5   1910.0,
      value  count  first_idx  last_idx  current_gap      avg_gap  max_gap
 2306  4057      4       1159      8306            0  2382.333333   3550.0
 5465  9684      3       4352      8305            1  1976.500000   2975.0
 364   0638      3       4288      8300            6  2006.000000   2856.0
 963   1702      3       3132      8297            9  2582.500000   4852.0
 2564  4509      3       6482      8290           16   904.000000   1652.0
 3349  5912      3       4840      8282           24  1721.000000   3043.0
 632   1122      3       2018      8275           31  3128.500000   5737.0
 4247  7499      3       4797      8271           35  1737.000000   3080.0
 373   0652      3       3363      8270           36  2453.500000   2770.0
 4270  7531      3       1769      8266           40  3248.500000   5979.0
 3115  5501      4       1468      8253           53  2261.666667   4241.0
 1929  3397      3       2163      8251           55  3044.000000   5574.0
 4003  7050      3       1480      8244           62  3382.000000   5505.0
 2480  4360      4       2595      8233           73  1879.333333   2829.0
 3165  5608      3       3870      8229           77  2179.500000   3007.0)
```
```python
# Gap per digit overall (digit appearing anywhere in the number), with occurrence positions expanded
digit_occ = []
for idx, num in zip(long['seq_idx'], long['num']):
    for pos, d in enumerate(num, start=1):
        digit_occ.append((idx, pos, d))
digit_occ_df = pd.DataFrame(digit_occ, columns=['seq_idx', 'pos', 'digit'])

digit_gap_overall = []
for d, grp in digit_occ_df.groupby('digit'):
    # One entry per slot, even if the digit occurs several times in that number
    uniq_idx = grp['seq_idx'].drop_duplicates().sort_values().values
    gaps = np.diff(uniq_idx)
    digit_gap_overall.append({
        'digit': d,
        'count': len(grp),
        'numbers_containing_digit_count': len(uniq_idx),
        'current_gap_slots': int(long['seq_idx'].max() - uniq_idx.max()),
        'avg_gap_slots': gaps.mean() if len(gaps) > 1 else np.nan,
        'max_gap_slots': gaps.max() if len(gaps) > 1 else np.nan,
    })
digit_gap_overall = pd.DataFrame(digit_gap_overall).sort_values('current_gap_slots', ascending=False)
digit_gap_overall
```

```
  digit  count  ...  avg_gap_slots  max_gap_slots
2     2   3347  ...       2.829526             27
1     1   3420  ...       2.792802             17
3     3   3283  ...       2.911961             22
9     9   3292  ...       2.874610             19
8     8   3280  ...       2.893691             19
6     6   3317  ...       2.883681             23
0     0   3273  ...       2.906895             19
4     4   3356  ...       2.838974             18
5     5   3284  ...       2.882986             19
7     7   3376  ...       2.827375             22

[10 rows x 6 columns]
```
```python
# Print the full table so no columns are elided
print(digit_gap_overall.to_string(index=False))
```

```
digit  count  numbers_containing_digit_count  current_gap_slots  avg_gap_slots  max_gap_slots
    2   3347                            2934                  7       2.829526             27
    1   3420                            2974                  3       2.792802             17
    3   3283                            2852                  3       2.911961             22
    9   3292                            2888                  1       2.874610             19
    8   3280                            2870                  1       2.893691             19
    6   3317                            2881                  1       2.883681             23
    0   3273                            2858                  0       2.906895             19
    4   3356                            2926                  0       2.838974             18
    5   3284                            2881                  0       2.882986             19
    7   3376                            2938                  0       2.827375             22
```
```python
# Gap per digit within each position
pos_gap_records = []
for pos in range(1, 5):
    for d, grp in digit_occ_df[digit_occ_df['pos'] == pos].groupby('digit'):
        idxs = grp['seq_idx'].sort_values().values
        gaps = np.diff(idxs)
        pos_gap_records.append({
            'pos': pos, 'digit': d, 'count': len(idxs),
            'current_gap_slots': int(long['seq_idx'].max() - idxs.max()),
            'avg_gap_slots': gaps.mean() if len(gaps) > 1 else np.nan,
            'max_gap_slots': gaps.max() if len(gaps) > 1 else np.nan,
        })
pos_gap = pd.DataFrame(pos_gap_records)
# For each position, keep the three most overdue digits
pos_overdue = pos_gap.sort_values(['pos', 'current_gap_slots'], ascending=[True, False]).groupby('pos').head(3)
pos_overdue
```

```
    pos digit  count  current_gap_slots  avg_gap_slots  max_gap_slots
6     1     6    830                 28       9.985525             69
2     1     2    845                 13       9.815166             66
7     1     7    833                 10       9.968750             81
14    2     4    794                 35      10.421185             62
13    2     3    838                 23       9.893668             73
19    2     9    810                 11      10.245983             63
29    3     9    865                 27       9.570602             58
21    3     1    894                 20       9.278835             62
26    3     6    785                 15      10.567602             60
33    4     3    797                 27      10.394472             57
39    4     9    810                 15      10.228677             81
36    4     6    847                 11       9.803783             71
```
```python
last2_gap = gap_stats_for_series(long['last2'].values, long['seq_idx'].values)
# Expected count per ending is 8307/100 = 83.1; keep endings with count >= 80
# and rank them by current gap.
overdue_last2 = last2_gap[last2_gap['count'] >= 80].sort_values('current_gap', ascending=False).head(10)
recent_last2 = last2_gap[last2_gap['count'] >= 80].sort_values('current_gap').head(10)
overdue_last2, recent_last2
```

```
(   value  count  first_idx  last_idx  current_gap     avg_gap  max_gap
 63    63     83        243      7680          626   90.695122      360
 53    53     88         23      7953          353   91.149425      349
 41    41     83          2      7955          351   96.987805      517
 28    28     95          4      7971          335   84.755319      352
 67    67     81         93      7994          312   98.762500      359
 95    95     86         18      8000          306   93.905882      423
 51    51     89        281      8001          305   87.727273      568
 16    16     89        102      8036          270   90.159091      435
 59    59     81         40      8059          247  100.237500      351
 10    10     92        248      8064          242   85.890110      331,
    value  count  first_idx  last_idx  current_gap    avg_gap  max_gap
 57    57     84         36      8306            0  99.638554      393
 84    84     95        320      8305            1  84.946809      491
 70    70     96         99      8304            2  86.368421      593
 4     04     85        403      8302            4  94.035714      412
 45    45     82        211      8301            5  99.876543      343
 21    21     84         46      8299            7  99.433735      381
 2     02     95         71      8297            9  87.510638      506
 76    76    121         13      8295           11  69.016667      354
 8     08     85         28      8294           12  98.404762      524
 87    87     92         47      8293           13  90.615385      580)
```
```python
# Time slot summary across all outputs
slot_summary = long.groupby('Waktu').agg(
    entries=('num', 'size'),
    unique_numbers=('num', 'nunique'),
    repeat_rate=('has_repeat', 'mean'),
    palindrome_rate=('palindrome', 'mean'),
    avg_digit_sum=('sum_digits', 'mean'),
    even_digit_rate=('num', lambda s: np.mean([int(d) % 2 == 0 for x in s for d in x])),
    high_digit_rate=('num', lambda s: np.mean([int(d) >= 5 for x in s for d in x])),
).reset_index()
slot_summary['repeat_pct'] = slot_summary['repeat_rate'] * 100
slot_summary['palindrome_pct'] = slot_summary['palindrome_rate'] * 100
slot_summary['even_digit_pct'] = slot_summary['even_digit_rate'] * 100
slot_summary['high_digit_pct'] = slot_summary['high_digit_rate'] * 100
slot_summary
```

```
   Waktu  entries  ...  even_digit_pct  high_digit_pct
0  03:30     1404  ...          50.196          49.573
1  06:30     1404  ...          50.142          49.359
2  11:30     1391  ...          48.203          49.928
3  13:30     1313  ...          50.114          51.314
4  17:30     1391  ...          50.521          49.928
5  21:30     1404  ...          50.089          48.825

[6 rows x 12 columns]
```
```python
# Print the selected columns in full so nothing is elided
cols = ['Waktu', 'entries', 'unique_numbers', 'repeat_pct', 'palindrome_pct',
        'avg_digit_sum', 'even_digit_pct', 'high_digit_pct']
print(slot_summary[cols].round(2).to_string(index=False))
```

```
Waktu  entries  unique_numbers  repeat_pct  palindrome_pct  avg_digit_sum  even_digit_pct  high_digit_pct
03:30     1404            1315       45.01            0.93          17.88           50.20           49.57
06:30     1404            1304       46.15            0.71          17.96           50.14           49.36
11:30     1391            1297       49.39            0.86          17.86           48.20           49.93
13:30     1313            1227       43.41            0.69          18.36           50.11           51.31
17:30     1391            1290       43.13            0.86          17.94           50.52           49.93
21:30     1404            1305       46.37            1.00          17.79           50.09           48.82
```
```python
# Monthly summary
long['month'] = long['date'].dt.strftime('%Y-%m')
monthly = long.groupby('month').agg(
    entries=('num', 'size'),
    repeat_pct=('has_repeat', lambda s: s.mean() * 100),
    avg_sum=('sum_digits', 'mean'),
    high_pct=('num', lambda s: np.mean([int(d) >= 5 for x in s for d in x]) * 100),
    even_pct=('num', lambda s: np.mean([int(d) % 2 == 0 for x in s for d in x]) * 100),
).reset_index()
monthly.round(2)
```

```
     month  entries  repeat_pct  avg_sum  high_pct  even_pct
0  2026-02     1417       43.54    17.93     49.51     50.04
1  2026-03     2340       46.20    17.90     49.70     49.33
2  2026-04     2340       43.80    18.00     49.84     50.40
3  2026-05     2210       48.19    18.00     50.07     49.81
```
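```python
import scipy.stats as stats

# Chi-square goodness-of-fit test of the pooled digit counts against a uniform distribution
chisq, p = stats.chisquare(digit_counts.values)
chisq, p
```

```
(np.float64(6.726134585289514), np.float64(0.6656082633719298))
```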
```python
# Same chi-square test, one position at a time
pos_chi = []
for pos in range(1, 5):
    counts = pos_freq_df[f'pos{pos}'].values
    ch, pv = stats.chisquare(counts)
    pos_chi.append((pos, ch, pv))
pos_chi
```

```
[(1, np.float64(5.38593956903816), np.float64(0.7994514088230359)),
 (2, np.float64(6.67882508727579), np.float64(0.6705180800135313)),
 (3, np.float64(15.627904177199952), np.float64(0.07507039093592495)),
 (4, np.float64(6.666787047068737), np.float64(0.6717665401665336))]
```
```python
# How many distinct 1st Prize numbers, and how often they repeat
first_unique = df['1st Prize'].nunique()
first_repeated = (prize1_counts > 1).sum()
first_max = prize1_counts.max()
first_unique, len(df), first_repeated, first_max
```

```
(618, 639, np.int64(21), np.int64(2))
```
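```python
# Distinct numbers overall, numbers seen once, numbers seen at least twice, and the maximum count
overall_unique = long['num'].nunique()
overall_unique, len(long), len(exact_counts[exact_counts == 1]), len(exact_counts[exact_counts >= 2]), exact_counts.max()
```

```
(5637, 8307, 3611, 2026, np.int64(6))
```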
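```python
# Two-sided z-test of the observed repeat-digit share against the independent-digit expectation
n = len(long)
obs_repeat = long['has_repeat'].sum()
p0 = 1 - (10 * 9 * 8 * 7) / (10 ** 4)   # P(at least one repeated digit) if digits are independent and uniform
z = (obs_repeat - n * p0) / math.sqrt(n * p0 * (1 - p0))
pval_two = 2 * (1 - stats.norm.cdf(abs(z)))
p0, obs_repeat, obs_repeat / n, z, pval_two
```

```
(0.496, np.int64(3788), np.float64(0.4560009630432166), np.float64(-7.29147739706747), np.float64(3.0664359940146824e-13))
```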
```python
def fmt_gap_table(df_gap, value_col='value', n=8):
    out = df_gap.head(n).copy()
    out['gap_output_slots'] = out['current_gap']
    # 13 output slots per draw record
    out['approx_draw_gap'] = out['current_gap'] / 13
    return out[[value_col, 'count', 'gap_output_slots', 'approx_draw_gap', 'avg_gap', 'max_gap']]

fmt_overdue_exact = fmt_gap_table(overdue_exact, n=8)
fmt_recent_exact = recent_exact.head(8).copy()
fmt_recent_exact['gap_output_slots'] = fmt_recent_exact['current_gap']
fmt_recent_exact['approx_draw_gap'] = fmt_recent_exact['current_gap'] / 13
fmt_recent_exact = fmt_recent_exact[['value', 'count', 'gap_output_slots', 'approx_draw_gap']]
fmt_overdue_exact.round(1), fmt_recent_exact.round(1)
```

```
(     value  count  gap_output_slots  approx_draw_gap  avg_gap  max_gap
 1420  2500      3              7275            559.6    511.0    646.0
 3852  6814      3              6388            491.4    255.0    461.0
 24    0042      3              6331            487.0    884.0   1577.0
 680   1206      3              6267            482.1    627.5    699.0
 4596  8146      3              6177            475.2    160.5    220.0
 4330  7636      3              6176            475.1    708.5   1232.0
 2889  5108      3              6155            473.5    367.5    435.0
 529   0946      3              5999            461.5    978.0   1867.0,
      value  count  gap_output_slots  approx_draw_gap
 2306  4057      4                 0              0.0
 5465  9684      3                 1              0.1
 364   0638      3                 6              0.5
 963   1702      3                 9              0.7
 2564  4509      3                16              1.2
 3349  5912      3                24              1.8
 632   1122      3                31              2.4
 4247  7499      3                35              2.7)
```
```python
overdue_last2_fmt = overdue_last2.head(8).copy()
overdue_last2_fmt['approx_draw_gap'] = overdue_last2_fmt['current_gap'] / 13
overdue_last2_fmt[['value', 'count', 'current_gap', 'approx_draw_gap', 'avg_gap', 'max_gap']].round(1)
```

```
   value  count  current_gap  approx_draw_gap  avg_gap  max_gap
63    63     83          626             48.2     90.7      360
53    53     88          353             27.2     91.1      349
41    41     83          351             27.0     97.0      517
28    28     95          335             25.8     84.8      352
67    67     81          312             24.0     98.8      359
95    95     86          306             23.5     93.9      423
51    51     89          305             23.5     87.7      568
16    16     89          270             20.8     90.2      435
```
```python
# Three most and three least frequent digits per position
top_pos = []
for pos in range(1, 5):
    counts = pos_freq_df[f'pos{pos}']
    top = counts.sort_values(ascending=False).head(3)
    low = counts.sort_values(ascending=True).head(3)
    top_pos.append({
        'pos': pos,
        'top3': ', '.join([f"{d} ({c}, {c/len(long)*100:.2f}% of numbers)" for d, c in top.items()]),
        'low3': ', '.join([f"{d} ({c})" for d, c in low.items()]),
    })
top_pos
```

```
[{'pos': 1, 'top3': '4 (861, 10.36% of numbers), 3 (861, 10.36% of numbers), 2 (845, 10.17% of numbers)', 'low3': '8 (799), 5 (803), 9 (807)'},
 {'pos': 2, 'top3': '5 (867, 10.44% of numbers), 2 (862, 10.38% of numbers), 6 (855, 10.29% of numbers)', 'low3': '4 (794), 8 (807), 9 (810)'},
 {'pos': 3, 'top3': '1 (894, 10.76% of numbers), 7 (883, 10.63% of numbers), 9 (865, 10.41% of numbers)', 'low3': '6 (785), 3 (787), 5 (805)'},
 {'pos': 4, 'top3': '4 (873, 10.51% of numbers), 1 (851, 10.24% of numbers), 8 (851, 10.24% of numbers)', 'low3': '3 (797), 5 (809), 9 (810)'}]
```
I ran the five types of analysis on data_history.txt: 639 draw records, 13 numbers per record, 8,307 4D numbers in total, covering 10 Feb 2026 21:30 through 29 May 2026 06:30.
Important note: this is a statistical analysis of historical data, not a definite prediction. If the draw process is random, past results do not guarantee what comes next.
The most frequent 4D numbers across all prize columns:
| Number | Frequency | Percentage |
|---|---|---|
| 3462 | 6 | 0.072% |
| 7341 | 6 | 0.072% |
| 7359 | 5 | 0.060% |
| 3744 | 5 | 0.060% |
| 2895 | 5 | 0.060% |
| 8294 | 5 | 0.060% |
| 4796 | 5 | 0.060% |
| 8218 | 5 | 0.060% |
| 9178 | 5 | 0.060% |
| 7048 | 5 | 0.060% |
For 1st Prize alone, the 639 draws produced 618 distinct numbers. The 21 numbers that repeat each appear only twice, so any "hot number" signal for 1st Prize is very weak.
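To put the 618-of-639 figure in context, here is a minimal sketch of what a fair draw would be expected to produce. It assumes each 1st Prize is an independent, uniform pick from the 10,000 possible 4D numbers; that model is an assumption, not something the data file states, and the helper name `expected_distinct` is illustrative.

```python
def expected_distinct(n_draws: int, n_values: int = 10_000) -> float:
    """Expected number of distinct values after n_draws independent uniform picks."""
    # A given value is missed by all draws with probability (1 - 1/n_values) ** n_draws.
    p_never_seen = (1 - 1 / n_values) ** n_draws
    return n_values * (1 - p_never_seen)


expected = expected_distinct(639)
print(f"expected distinct 1st Prize numbers: {expected:.1f} (observed: 618)")
```

Under that assumption the expected count is about 619, so 618 distinct numbers is roughly what chance alone would give.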
Overall digit frequency:
| Digit | Frequency | Percentage |
|---|---|---|
| 1 | 3,420 | 10.29% |
| 7 | 3,376 | 10.16% |
| 4 | 3,356 | 10.10% |
| 2 | 3,347 | 10.07% |
| 6 | 3,317 | 9.98% |
| 9 | 3,292 | 9.91% |
| 5 | 3,284 | 9.88% |
| 3 | 3,283 | 9.88% |
| 8 | 3,280 | 9.87% |
| 0 | 3,273 | 9.85% |
The digit distribution is well balanced. A chi-square test against the uniform distribution gives χ² ≈ 6.73 with p ≈ 0.666, so there is no strong deviation from 10% per digit.
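The chi-square statistic can be reproduced by hand from the counts in the table. The sketch below uses only the standard library; the 5% critical value for 9 degrees of freedom (16.919) is a standard table value rather than something computed in the session.

```python
# Digit counts copied from the table above, ordered by digit 0..9.
observed = [3273, 3420, 3347, 3283, 3356, 3284, 3317, 3376, 3280, 3292]

total = sum(observed)              # 4 digits per number: 8,307 * 4 = 33,228
expected = total / len(observed)   # uniform model: every digit equally likely

# Pearson chi-square statistic: sum of (O - E)^2 / E over the ten digits.
chi2 = sum((o - expected) ** 2 / expected for o in observed)

# Standard 5% critical value for 9 degrees of freedom (10 categories minus 1).
CRITICAL_5PCT_DF9 = 16.919

print(f"chi2 = {chi2:.3f}, reject uniformity at 5%: {chi2 > CRITICAL_5PCT_DF9}")
```

The statistic is far below the critical value, which matches the large p-value reported above.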
A gap is the number of 4D-number slots since a value last appeared. Because each record contains 13 numbers, 130 slots correspond to roughly 10 draw records.
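As a small illustration of that definition, the sketch below converts a last-seen slot index into a gap and then into approximate draw records. The function names are illustrative; the slot indices (last slot 8,306, number 2500 last seen at slot 1,031) are taken from the session output.

```python
SLOTS_PER_RECORD = 13  # each draw record contributes 13 four-digit numbers


def current_gap(last_seen_slot: int, final_slot: int) -> int:
    """Slots elapsed between a value's last appearance and the end of the data."""
    return final_slot - last_seen_slot


def slots_to_records(gap_slots: int) -> float:
    """Convert a gap measured in slots to an approximate number of draw records."""
    return gap_slots / SLOTS_PER_RECORD


# Slots are numbered 0..8306; the number 2500 was last seen at slot 1031.
gap = current_gap(last_seen_slot=1031, final_slot=8306)
print(gap, round(slots_to_records(gap), 1))
```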
Numbers with at least three appearances that have gone longest without reappearing:
| Number | Total Appearances | Gap (Slots) | Approx. Gap (Records) |
|---|---|---|---|
| 2500 | 3 | 7,275 | 560 |
| 6814 | 3 | 6,388 | 491 |
| 0042 | 3 | 6,331 | 487 |
| 1206 | 3 | 6,267 | 482 |
| 8146 | 3 | 6,177 | 475 |
| 7636 | 3 | 6,176 | 475 |
| 5108 | 3 | 6,155 | 474 |
| 0946 | 3 | 5,999 | 462 |
Last-two-digit endings that are relatively "overdue" (among endings with at least 80 appearances):
| Last 2 Digits | Frequency | Gap (Slots) | Approx. Gap (Records) |
|---|---|---|---|
| 63 | 83 | 626 | 48 |
| 53 | 88 | 353 | 27 |
| 41 | 83 | 351 | 27 |
| 28 | 95 | 335 | 26 |
| 67 | 81 | 312 | 24 |
| 95 | 86 | 306 | 24 |
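To show why an "overdue" ending is not more likely to appear next, here is a hedged sketch under a fair-draw assumption: each slot's last two digits are treated as an independent, uniform pick from the 100 possible endings. That model and the function names are assumptions for illustration only.

```python
P_ENDING = 1 / 100  # assumed chance of a given two-digit ending in any one slot


def prob_absent_for(gap_slots: int, p: float = P_ENDING) -> float:
    """P(a given ending does not appear in gap_slots consecutive slots)."""
    return (1 - p) ** gap_slots


def prob_next_given_absent(gap_slots: int, p: float = P_ENDING) -> float:
    """P(hit in the next slot | absent for gap_slots slots), from the definition
    of conditional probability: P(absent, then hit) / P(absent)."""
    absent_then_hit = (1 - p) ** gap_slots * p
    return absent_then_hit / prob_absent_for(gap_slots, p)


p_626 = prob_absent_for(626)
print(f"P(absent for 626 slots) = {p_626:.4%}")
print(f"P(next slot | absent 626) = {prob_next_given_absent(626):.2%}")
```

Under this model a 626-slot absence is rare for any one ending (about 0.19%), yet the chance of that ending appearing in the next slot stays at 1 in 100 regardless of the gap.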
| Pattern | Result |
|---|---|
| All digits distinct | 54.40% |
| At least one repeated digit | 45.60% |
| All digits identical, e.g. 1111 | 0.096% |
| Palindrome, e.g. 1221 | 0.843% |
| Strictly ascending run, e.g. 1234 | 7 occurrences |
| Strictly descending run, e.g. 4321 | 6 occurrences |
Notable finding: if the four digits were independent and uniformly random, about 49.6% of numbers would contain a repeated digit. In this data only 45.6% do (3,788 of 8,307, z ≈ −7.3), so numbers with four distinct digits are over-represented.
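The 49.6% baseline and the size of the deviation can be checked with the standard library alone. This is a sketch using a normal approximation to the binomial; it assumes the 8,307 numbers are independent of each other, which the source does not establish.

```python
import math

n = 8307          # total 4D numbers in the dataset
observed = 3788   # numbers with at least one repeated digit

# P(all four digits distinct) = 10*9*8*7 / 10^4 for independent uniform digits,
# so the chance of at least one repeated digit is the complement.
p_repeat = 1 - (10 * 9 * 8 * 7) / 10**4

# z-score of the observed count under the binomial model (normal approximation).
z = (observed - n * p_repeat) / math.sqrt(n * p_repeat * (1 - p_repeat))

# Two-sided p-value from the normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"p_repeat = {p_repeat:.3f}, z = {z:.2f}, p = {p_two_sided:.2e}")
```

A deviation this large is very unlikely under the independent-digit model, though it says nothing about which specific numbers will appear.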
Most frequent digit sums:
| Digit Sum | Frequency | Percentage |
|---|---|---|
| 17 | 575 | 6.92% |
| 19 | 574 | 6.91% |
| 18 | 563 | 6.78% |
| 16 | 529 | 6.37% |
| 15 | 529 | 6.37% |
| 20 | 508 | 6.12% |
The middle range 15–20 is the densest, which is what four independent digits would produce (their sum averages 18).
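The reference distribution behind that statement can be obtained by enumerating all 10,000 four-digit strings. The sketch below does so; the variable names are illustrative.

```python
from collections import Counter

# Tally the digit sum of every four-digit string from 0000 to 9999.
sum_counts = Counter(sum(map(int, f"{n:04d}")) for n in range(10_000))

# The single most common digit sum and how many strings have it.
mode_sum, mode_count = sum_counts.most_common(1)[0]

# Share of all strings whose digit sum falls in the 15..20 band.
share_15_20 = sum(sum_counts[s] for s in range(15, 21)) / 10_000

print(mode_sum, mode_count, f"{share_15_20:.2%}")
```

The six rows in the table add up to 3,278 of 8,307 numbers (about 39.5%), close to the 38.48% share that the enumeration gives for the same band.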
| Time | Total Numbers | Repeat-Digit Rate | Mean Digit Sum | High Digits 5–9 |
|---|---|---|---|---|
| 03:30 | 1,404 | 45.01% | 17.88 | 49.57% |
| 06:30 | 1,404 | 46.15% | 17.96 | 49.36% |
| 11:30 | 1,391 | 49.39% | 17.86 | 49.93% |
| 13:30 | 1,313 | 43.41% | 18.36 | 51.31% |
| 17:30 | 1,391 | 43.13% | 17.94 | 49.93% |
| 21:30 | 1,404 | 46.37% | 17.79 | 48.82% |
Notable time-slot patterns: 11:30 has the highest repeat-digit rate (49.39%), while 13:30 has the highest mean digit sum (18.36) and the highest share of high digits (51.31%).
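One rough way to judge whether the per-slot differences in the table exceed sampling noise is to compare each slot's repeat-digit rate with the pooled rate, scaled by the standard error of a proportion. This is a screening sketch only: the pooled rate includes each slot's own data, six slots are examined at once, and the function name `z_vs_pooled` is illustrative.

```python
import math

POOLED_RATE = 0.4560  # overall repeat-digit rate (3,788 of 8,307)

# (slot, number count, repeat-digit rate) copied from the table above.
slots = [
    ("03:30", 1404, 0.4501),
    ("06:30", 1404, 0.4615),
    ("11:30", 1391, 0.4939),
    ("13:30", 1313, 0.4341),
    ("17:30", 1391, 0.4313),
    ("21:30", 1404, 0.4637),
]


def z_vs_pooled(n: int, rate: float, pooled: float = POOLED_RATE) -> float:
    """Distance of a slot's rate from the pooled rate, in standard errors."""
    standard_error = math.sqrt(pooled * (1 - pooled) / n)
    return (rate - pooled) / standard_error


z_scores = {slot: z_vs_pooled(n, rate) for slot, n, rate in slots}
for slot, z in z_scores.items():
    print(f"{slot}: z = {z:+.2f}")
```

Only 11:30 lands more than two standard errors from the pooled rate, and with six slots examined that is suggestive rather than conclusive.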
By month, the repeat-digit rate rises in May:
| Month | Total Numbers | Repeat-Digit Rate | Mean Digit Sum |
|---|---|---|---|
| Feb 2026 | 1,417 | 43.54% | 17.93 |
| Mar 2026 | 2,340 | 46.20% | 17.90 |
| Apr 2026 | 2,340 | 43.80% | 18.00 |
| May 2026 | 2,210 | 48.19% | 18.00 |
If the draw system is assumed to be fair:
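A minimal sketch of what the fair-draw assumption implies for this dataset, treating every slot as an independent, uniform pick from the 10,000 possible 4D numbers. The constants come from the dataset summary; the binomial model and the helper `binom_pmf` are assumptions for illustration.

```python
import math

N_SLOTS = 8307      # total 4D numbers in the dataset
N_VALUES = 10_000   # possible 4D numbers, 0000..9999
P_NUMBER = 1 / N_VALUES

# Expected appearances of one specific number, and of one two-digit ending.
expected_per_number = N_SLOTS * P_NUMBER   # about 0.83
expected_per_ending = N_SLOTS / 100        # about 83


def binom_pmf(k: int, n: int = N_SLOTS, p: float = P_NUMBER) -> float:
    """P(exactly k appearances in n independent slots)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Chance that one particular number appears six or more times, and the
# expected number of such values among all 10,000 possibilities.
p_six_plus = 1 - sum(binom_pmf(k) for k in range(6))
expected_six_plus = N_VALUES * p_six_plus

print(f"{expected_per_number:.2f} {expected_per_ending:.1f} {expected_six_plus:.2f}")
```

Under this model about two numbers would be expected to reach six appearances, and the frequency table shows exactly two (3462 and 7341), so the "most frequent" numbers are in line with chance.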
Statistical watchlist, not a prediction:
| Type | Candidates Based on the Data |
|---|---|
| Most frequent numbers | 3462, 7341, 7359, 3744, 2895 |
| Frequent first 2 digits | 70, 68, 82, 75, 48 |
| Frequent last 2 digits | 76, 94, 48, 19, 13 |
| Overdue last 2 digits | 63, 53, 41, 28, 67 |
| Dominant pattern | four distinct digits, digit sum 15–20 |
| Time slots of interest | 13:30 tends toward higher digits; 11:30 has a higher repeat-digit rate |
The main conclusion: digits overall are fairly evenly distributed, and exact 4D numbers are too widely scattered to support strong prediction. The most consistent pattern in this dataset is that numbers with four distinct digits occur more often than numbers with a repeated digit.