; Done items
Added: Given I am using C64 processor port options
This stops writes to the IO registers from going to RAM
* Add emulated BombJack display
https://github.com/martinpiper/BombJack/blob/master/README.md
Sprites notes
0x = far left visible screen pixel
Done 0y = 16x16 Top of the sprite just below the bottom edge of the visible screen
Done 0y = 32x32 Middle horizontal part of the sprite just below the bottom edge of the visible screen
Add 32x32 sprites 9a00-9a01 (largeSprite)
Done - Check the hardware for how the inclusive register really works with regards to using sprite update slots
0-3 No 32x32
0-4 Sprite 0 is 32x32, sprite 1 is not drawn, sprites 2 onwards are drawn
0-5 Sprite 0,2 is 32x32, sprite 1,3 is not drawn, sprites 4 onwards are drawn
0-6 Sprite 0,2,4 is 32x32, sprite 1,3,5 is not drawn, sprites 6 onwards are drawn
0-7 Sprite 0,2,4,6 is 32x32, sprite 1,3,5,7 is not drawn, sprites 8 onwards are drawn
1-4 Sprite 0 is 32x32, sprite 1 is not drawn, sprites 2 onwards are drawn
2-4 Same
3-4 Same
4-4 No 32x32
4-5 Sprites 0-1 are normal, Sprite 2 is 32x32, Sprite 3 is not drawn, sprites 4 onwards are drawn
5-4 Sprites 0-1 are normal, Sprite 2 is 32x32, Sprite 3 is not drawn, sprites 4 onwards are drawn
5-5 No 32x32
6-5 Sprites 0-3 are normal, Sprite 4 is 32x32, Sprite 5 is not drawn, sprites 6 onwards are drawn
7-5 Sprites 0-3 are normal, Sprite 4,6 is 32x32, Sprite 5,7 is not drawn, sprites 8 onwards are drawn
1-5 Sprite 0,2 is 32x32, sprite 1,3 is not drawn, sprites 4 onwards are drawn
2-5 Same
Does it skip the next sprite?
Yes
What about the odd numbered sprites for the ranges?
No, note the lines into 5R 5S
Also the sprite frame, does it multiply the sprite frame to ensure it is aligned?
Yes, it multiplies by 4
What is the sprite layout?
01
23
Do sprites off the far right appear on the left?
Yes
Do sprites off the bottom appear on the top?
No.
Done - Add full height mode (fullHeightSprite)
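A minimal sketch of the 32x32 frame lookup described in the sprite notes above, assuming the 01/23 quadrant layout and the multiply-by-4 frame alignment. The class and method names are hypothetical and are not taken from the emulator source.

```java
// Hypothetical helper for illustration only, not the emulator's own code.
final class LargeSpriteTiles {
    /**
     * Returns the 16x16 frame index that supplies the pixel at (x, y)
     * inside a 32x32 sprite. Assumes the quadrant layout from the notes:
     *   0 1
     *   2 3
     * and that the sprite frame is multiplied by 4 so it is aligned.
     */
    static int tileFor(int spriteFrame, int x, int y) {
        if (x < 0 || x >= 32 || y < 0 || y >= 32) {
            throw new IllegalArgumentException("pixel outside 32x32 sprite");
        }
        int base = spriteFrame * 4;             // aligned group of four frames
        int quadrant = (y / 16) * 2 + (x / 16); // 0 1 on top, 2 3 below
        return base + quadrant;
    }

    public static void main(String[] args) {
        // Top right quadrant of sprite frame 3
        System.out.println(tileFor(3, 20, 5));
    }
}
```

This only covers the frame selection. Which sprite slots become 32x32 for a given register pair is listed in the table above and is not modelled here.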
* Emulated hardware display
When allocating layers, make the address and addressEx configurable
TODO: Address configs for addressEx=0x01 need to respect the INRAMSEL1 groups 0x8000 0x8800 0x9000 0x9800 ... 0xb800 as these are used for selection on each board
Video board uses fixed address, not configurable.
* To correctly handle the "lo ENABLEPIXELS (with no border flags)" at 0x189H
And to handle the hardware plane shifting, which introduces an 8 pixel delay
The tiles and chars need to latch the Y position at the start of their reads for each tile/char 8 pixel span
* Sprites also needed this fix, actually it was the sprite span pixel read and clear that gave the hint about the latch
* See: latchedDisplayV
* The mode7 layer does not have this problem because it maintains its own pixel clocks from hsync and vsync
* Continue adding syntax to support the other video layers
features\TestVideoHardware.feature
* Detect when the 24 bit bus is held with an address for an extended number of cycles without data being written
No need now, the hardware automatically resets the bus a while after the byte is written if another write has not been started
* displayBombJack.isVsyncTriggered
The $dd0d value is reading the logic level of vsync in the display, it technically doesn't need the reset logic
However the vsync wait will need a wait for "not vsync" if a wait for vsync is issued too quickly after the first wait
* Optional syntax to limit the speed to XX FPS for video display
* Add arrows+ctrl as joystick1 ($dc00) for the video display window
* Output the software bus writes so that the Proteus simulator can use them. Will need to detect waits for vsync or log x/y pos for writes
* Using target/debugData.txt the real hardware isn't displaying the high mode7 characters?
* Mode7Regs2Size and Mode7Regs3Size are wrong, causing strange memory overwrites
* Verified, by filling map.bin, that tile $02 and $22 are actually rendered
* The palette used in the tile was the same as the background colour, so it was "invisible".
* However the palette was being written strangely by the contention test code
* Twiddle the mode7 HV flips to align them with tiles
* The java video hardware test mode output debug file does not quite render correctly on the hardware
The raster bars and perspective mode7 part is alternating every frame. Probably something to do with the output displayV not quite catching up properly
Investigate if the "w$ff01ff00,$ff019000" is not being caught correctly
It seems waiting for the last line and trying to do all the updates is not working out timing wise
Removing the wait for line $ff and moving the sprite update to just after the vsync wait worked
* TODO: Investigate, do we want to wait for -ve _VSYNC or -ve VBLANK as it is currently going into the digital data simulator?
Look at when the better, earlier, timing is just off the bottom of the screen
VBLANK is better
* The BDD6502 emulation will need to align the wait to vsync or vblank of course, to match with what the simulator does
* _VBLANK checking aligned with simulation and emulation
* Extract memory bus interface for future expansion
* Add audio expansion prototype
First single channel sound is working at the correct sample rate
Multiple channel is also working
* Attempt to convert a small mod/xm
http://www.retrospekt.com.au/2020/05/tiny-music-a-massive-curated-collection-of-music-in-mod-xm-s3m-other-formats/
https://www.dropbox.com/sh/yyxyrkin9uj76ie/AABYa381WWs8KXwsIIYo4_q7a?dl=0
Need to find an up to 8 channel file that will fit within 64K for samples (without down sampling)
Without too many complex voice controls would also help
Potential music:
**4 channels: C:\Users\Martin Piper\Downloads\tiny music\mods\artists\h0ffman\H0ffman - Freerunner.mod
4 channels: C:\Users\Martin Piper\Downloads\tiny music\mods\artists\mortimer twang\Mortimer Twang - No Sellout.mod
4 channels: C:\Users\Martin Piper\Downloads\tiny music\mods\artists\mygg\Mygg - Techno Focus.mod
8 channels: C:\Users\Martin Piper\Downloads\tiny music\mods\artists\zabutom\Zabutom - Godsends.xm
*4 channels: C:\Users\Martin Piper\Downloads\tiny music\mods\various\4Mat - One Bullet Symphony.xm
Potential java libs for parsing and converting to semi-optimised 6502 suitable format:
* http://www.javamod.de/
http://www.javamod.de/javamod.html
cd /d C:\Users\Martin Piper\Downloads\
java -cp ./javamod.jar de.quippy.javamod.main.CommandLine "C:\Users\Martin Piper\Downloads\tiny music\mods\artists\h0ffman\H0ffman - Freerunner.mod"
ProTracker mod with 31 samples and 4 channels using Protracker frequency table
Loader code:
C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\loader\tracker\ProTrackerMod.java
Player code:
C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\ModMixer.java
startPlayback()
Might be able to stub out openAudioDevice() and writeSampleDataToLine() and let it parse the file to get info from modMixer.mixIntoBuffer(LBuffer, RBuffer, bufferSize);
C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\mixer\BasicModMixer.java
mixIntoBuffer(final int[] leftBuffer, final int[] rightBuffer, final int count)
mixChannelIntoBuffers(final int[] leftBuffer, final int[] rightBuffer, final int startIndex, final int endIndex, final ChannelMemory actMemo)
Uses: C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\mixer\ProTrackerMixer.java
C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\mixer\BasicModMixer.java
ChannelMemory
doTickEvents()
Potentially more interesting for extracting note events
Yes definitely a lot more interesting, might even be possible to just call this in a loop to quickly get all the info
C:\Users\Martin Piper\Downloads\javamod-source\source\de\quippy\javamod\multimedia\mod\mixer\ProTrackerMixer.java
doRowEffects()
Some options for export here:
mixIntoBuffer
Can export all relevant changes in ChannelMemory
By modifying mixChannelIntoBuffers()
* First pass of importing music works, repeating samples are needed, many volume and pitch updates etc are missing.
The sample frequency conversion (// Convert internal frequency to hardware values) seems to be correct
java -cp ./javamod.jar de.quippy.javamod.main.CommandLine "C:\Users\Martin Piper\Downloads\_nice_outfit_.mod"
* AudioExpansion could probably do with a loop address and loop length to be used after the first values
Could use a flip-flop to hold the state, which is reset when high length is written. This obviously adds complexity though.
* Add kMusicCommandDefineSample with index, then remove duplicate sample data from kMusicCommandPlayNote
Currently: "C:\Users\Martin Piper\Downloads\asikwp_-_twistmachine.mod"
3 m 20s in 127,487 bytes
After writing common sample data with kMusicCommandSetSampleData: 61,766 bytes
* Added audio hardware syntax and test code
The MemoryBus architecture is also expansion aware.
* Continue MusicPoll with jsr DecompressMusic_GetNextByte
Process the music events kMusicCommandWaitFrames etc
* Find out why when running from jar the video display slows down.
It's almost like not enough pixels get calculated?
* Print the number of instructions elapsed per video display frame
Hmm Oracle java is slowing down, but this java does not: "C:\Users\Martin Piper\.jdks\corretto-1.8.0_252\bin\java.exe"
Corretto (https://aws.amazon.com/corretto/faqs/) has better performance than Oracle java
Oracle java sucks
1>Rendered FPS = 48 frameDelta = 535 period=1001 instructionsThisPeriod=610032 instructionsPerFrame=10157 instructionsShortfall=2515
instructionsThisPeriod=610032
number of emulated 6502 instructions in 1001 milliseconds
so about 0.6 million instructions per second
same jar with Corretto java:
1>Rendered FPS = 60 frameDelta = 1 period=1001 instructionsThisPeriod=3463955 instructionsPerFrame=57674 instructionsShortfall=-45002
About 5.7 times faster (3463955 / 610032)
* QuickDrawPanel::fastSetRGB() created to optimise image drawing further, this is to fix slowdown when using Oracle java compared to the much faster Corretto
* Output raw sample bytes for debug playback and comparison with the hardware
e.g. "C:\Users\Martin Piper\Downloads\ffmpeg-20200422-2e38c63-win64-static\ffmpeg-20200422-2e38c63-win64-static\bin\ffplay.exe" -f u8 -channels 2 -ar 25000 c:\work\C64\VideoHardware\target\debugchannel.pcmu8
e.g. "C:\Users\Martin Piper\Downloads\ffmpeg-20200422-2e38c63-win64-static\ffmpeg-20200422-2e38c63-win64-static\bin\ffplay.exe" -f u8 -channels 2 -ar 25000 C:\Work\BDD6502\target\debugchannel.pcmu8
* Emulation will need to match hardware design which now has stereo output
* Force plane ordering, including a plane that is configured to output a specific pixel colour (to emulate poking wires into the header)
* Tile layer background pixel transparency check. Layer should use specific plane.
* In the hardware simulation: If the sprites layer is the last layer, the sprite's x position seems to update the colour information for the whole vertical strip
This makes sense since the full height sprite flag has consistent colour and comes from the colour read.
The bit plane shifters will be emitting zeros since they will not load the sprite data based on the vertical position test.
The emulation needs to be updated to reflect this. Currently the emulation does not write colour information if the sprite vertical portion is out of range.
* Full height sprites with 32x32 mode enabled, should not display repeated 32x32 sprite data chunks, they should show 32x16 sprite data
According to the simulation from bus data output by features\TestVideoHardware Chars Sprites.feature
* spriteIndexReJig is used to adjust the sprite reading schedule to that of the hardware
* Chars V4.0 emulation update to match schematic
* Emulate C64 timer to allow the video clock (12.096M) to drive the timer clocks
A use case is where the C64 sets a timer to count the vertical raster position after the vsync signal is detected
Pixel clock = 12.096M / 2 = 6048000 Hz
The line length is 384 pixels, which gives 15750 Hz or 0.063492063492063 ms or 6.349206349206349e-5 seconds
C64 CyclesPerSecondPALC64 = 985248 Hz
Which means 985248 (clocks per second) / 15750 (line rate from video, in Hz) = 62.56 clocks per line
* Added Video_StartRasterTimers
* Easiest route will be to have a faked timer for the emulation that is setup to provide values like those in the hardware, but ignores the C64 CIA timer code setup values
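The timer arithmetic above can be checked with a short sketch. The constants are the ones quoted in the notes; the class and method names are made up for illustration.

```java
// Hypothetical helper that reproduces the arithmetic from the notes.
final class RasterTimerMaths {
    static final long VIDEO_CLOCK_HZ = 12_096_000L;          // 12.096M video clock
    static final int PIXELS_PER_LINE = 384;                  // line length in pixels
    static final long C64_PAL_CYCLES_PER_SECOND = 985_248L;  // CyclesPerSecondPALC64

    /** Pixel clock is the video clock divided by two. */
    static long pixelClockHz() {
        return VIDEO_CLOCK_HZ / 2;
    }

    /** Number of video lines per second. */
    static double lineRateHz() {
        return (double) pixelClockHz() / PIXELS_PER_LINE;
    }

    /** Number of C64 clocks that elapse during one video line. */
    static double c64CyclesPerLine() {
        return C64_PAL_CYCLES_PER_SECOND / lineRateHz();
    }

    public static void main(String[] args) {
        System.out.println("pixel clock Hz  = " + pixelClockHz());
        System.out.println("line rate Hz    = " + lineRateHz());
        System.out.println("seconds per line = " + (1.0 / lineRateHz()));
        System.out.println("C64 clocks/line = " + c64CyclesPerLine());
    }
}
```

The clocks per line value is not a whole number, which is one reason a faked timer that follows the video position may be simpler for the emulation than reproducing the CIA setup exactly.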
* Moved display enable/border/priority registers to video generation logic
* Handle 16 colour mode for new hardware revision
* Adding an APU means "a user port to 24 bit bus is installed" and "a new audio expansion" will need to cooperate since it's possible the APU drives the sound board
This means all display/audio layers will need to get their memory from the APU, not the user port
* The audio and display layer syntax can be re-ordered to be after the user port and APU syntax
* com.bdd6502.DisplayBombJack.calculatePixel will need a callback to an optional APU
* Tidy up all the "gotByte & 0xff" rubbish. Make it an accessor that returns an int ready for registers to use
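One possible shape for that accessor, as a sketch only. The class name BusByte is hypothetical; Byte.toUnsignedInt is standard Java 8.

```java
// Hypothetical wrapper, shown to illustrate replacing scattered "gotByte & 0xff" masks.
final class BusByte {
    private final byte raw;

    BusByte(byte raw) {
        this.raw = raw;
    }

    /** Returns the byte as an int in the range 0..255, ready for register use. */
    int unsigned() {
        return Byte.toUnsignedInt(raw);
    }

    public static void main(String[] args) {
        System.out.println(new BusByte((byte) 0xff).unsigned());
    }
}
```

Keeping the conversion in one place means register code never sees a negative value from a sign-extended byte.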
* APU: Debug display, configurable via syntax (and property) to display
Instruction hex, binary flags, text flags
PC, registers, and small memory dump for the various address registers
* de.quippy.javamod.multimedia.mod.ModMixer.fastExport add option to scale the samples (and their frequencies) to fit them into a target memory size (like 64K)
Will need a lower sample threshold size to avoid shrinking "chip sound" samples
de.quippy.javamod.multimedia.mod.mixer.BasicModMixer.exportSample
The sample scaling factor can also be calculated in here since exportSample() is called before the first usage of this sample
Then kMusicCommandPlayNote can include the scaling factor
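A rough sketch of how the scaling factor might be chosen, assuming samples shorter than a threshold are left untouched and the rest share whatever space remains. This is an illustration under those assumptions, not what fastExport or exportSample actually do, and all names are hypothetical.

```java
// Hypothetical calculation, not taken from the javamod or BDD6502 source.
final class SampleScaleSketch {
    /**
     * Returns the factor (0..1] to apply to the length of every sample that is
     * at least minScalableLength bytes long, so the total fits in targetBytes.
     * Short "chip sound" samples are assumed to be kept at full size.
     */
    static double scaleFactor(int[] sampleLengths, int targetBytes, int minScalableLength) {
        long fixed = 0;     // bytes used by samples that are never shrunk
        long scalable = 0;  // bytes used by samples that may be shrunk
        for (int len : sampleLengths) {
            if (len < minScalableLength) {
                fixed += len;
            } else {
                scalable += len;
            }
        }
        if (scalable == 0 || fixed + scalable <= targetBytes) {
            return 1.0; // already fits, or nothing can be shrunk
        }
        long budget = targetBytes - fixed;
        if (budget <= 0) {
            throw new IllegalArgumentException("short samples alone exceed the target size");
        }
        return (double) budget / scalable;
    }

    public static void main(String[] args) {
        int[] lengths = {100, 60000, 40000};
        System.out.println(scaleFactor(lengths, 65536, 256));
    }
}
```

The same factor would also need to be applied to the playback frequency of each scaled sample so the pitch stays the same.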
* Improved music data compression with intensive searching
* com.bdd6502.CompressData add option to search for the best "bestLen > 6" value to use, since there can be better savings when using a longer threshold.
* Syntax for "add a Chars V4.0 layer" needs to include scrolling registers, tests will need to use new data. Create a larger source image for this and create new output data.
Old small screen image conversion: oldbridge char screen with rgbfactor
TODO: Need to emulate the bad char definition reads with certain scroll values
* Various music conversion command lines. File from https://modarchive.org/ :
* Volume only? Not for this tune
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\cream_of_the_earth.mod" "target/exportedMusic" 65535 100626
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\lotus2-title.mod" "target/exportedMusic" 7 8
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\intro_01.mod" "target/exportedMusic" 7 8
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\turrican.mod" "target/exportedMusic" 65535 88560
* Problems with repeating? Or new note detection? newInstrumentSet
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\blood_money_title.mod" "target/exportedMusic" 65535 155920
* Fixed: The "aktMemo.newInstrumentSet = true;" needed to be inside the "element.getInstrument()>0" check. Doh!
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\speedbal.mod" "target/exportedMusic" 65535 151832
* Volume only? SotB fixed by kMusicCommandAdjustVolume. But it does cause large files and lots of clicking. Needs a better "instrument active" check before it can be used.
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\shadowofthebeast.mod" "target/exportedMusic" 1 1
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\outrun_intro.mod" "target/exportedMusic" 1 3
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\Shadow of the Beast (1)\Beast1_2.mod" "target/exportedMusic" 65535 71268
java -Dmusic.volume=1 -jar target\BDD6502-1.0.9-SNAPSHOT-jar-with-dependencies.jar --exportmod "C:\Users\Martin Piper\Downloads\xenon2.mod" "target/exportedMusic" 65535 325226
* Added music conversion detection of frequency change, which vastly improves playback
* Music conversion, just volume change detection needed?
* Merge event and channel
Channel low nybble
Event high nybble
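The merged byte can be sketched like this, with the event in the high nybble and the channel in the low nybble as noted above. The class and method names are hypothetical.

```java
// Hypothetical helper showing the event/channel packing described in the notes.
final class MusicEventByte {
    /** Packs a 4 bit event and a 4 bit channel into one byte value (0..255). */
    static int pack(int event, int channel) {
        if ((event & ~0x0f) != 0 || (channel & ~0x0f) != 0) {
            throw new IllegalArgumentException("event and channel must each fit in 4 bits");
        }
        return (event << 4) | channel;
    }

    /** Extracts the event from the high nybble. */
    static int eventOf(int packed) {
        return (packed >> 4) & 0x0f;
    }

    /** Extracts the channel from the low nybble. */
    static int channelOf(int packed) {
        return packed & 0x0f;
    }

    public static void main(String[] args) {
        int packed = pack(3, 5);
        System.out.println(Integer.toHexString(packed));
    }
}
```

This limits the format to 16 event types and 16 channels, which is an assumption that follows from the nybble split.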
* Excessive clicking when starting a voice fixed. Funnily enough this wasn't in the hardware, but was a fault of the emulation
Validated in hardware and emulation
* Music conversion: "C:\Users\Martin Piper\Downloads\SOTB_Musiques_Amiga\SOTB - TITLE.mod"
For some reason it isn't exporting even though this does play:
java -cp "C:\Users\Martin Piper\Downloads\javamod.jar" de.quippy.javamod.main.CommandLine "C:\Users\Martin Piper\Downloads\SOTB_Musiques_Amiga\SOTB - TITLE.mod"
* Not needed, since layer order register is included in V5.0
* To use the VideoHardware debug output, the hardware last two layers need to be swapped.
Perhaps add a switch selector for different layer priorities into the layer select to allow runtime configuration?
Extra (but disabled and not placed) logic has been added to the video layer
* Remove System.getProp* usage inside time critical routines
* Create 6502 remote debug TCP interface. Enough for the VICEPDBMonitor to work.
When enable remote debugging
And wait for debugger connection
And wait for debugger command
* Improved remote debug single stepping
* Added APU register remote debug output, if it's enabled
* Remote debugger commands like next / step / return do not need to wait for the textual reply before completing the command
The replies happen when the CPU next stops when sendDebuggerUpdate is true
* kAPU_SkipIfEQ needs to check the data select is stable from the previous cycle to ensure the logic operates on a stable value
* Need kAPU_InternalMEWR in BDD6502
* Add remote debugger single step option for APU
Perhaps a command to switch between 6510 CPU and APU for the debugger next/step/etc commands
Code reorganised "handleSuspendLoop(remoteDebugger , RemoteDebugger.kDeviceFlags_CPU);" to allow kDeviceFlags_CPU and kDeviceFlags_APU to suspend execution
* This will allow the APU to have "step out" to run until the next matching waitHV
* "Step over" and "step in" are basically the same as there are no JSR equivalent instructions
Note in the glue code "remoteDebugger.isCurrentDevice(RemoteDebugger.kDeviceFlags_CPU)", using kDeviceFlags_APU will let the APU detect when a command is intended for it instead of the CPU
Commands to switch APU and 6510:
cpu apu
cpu 6502
* Reset frame catch up times while debugging
Disable sound output
* While debugging with steps have the option of clearing the screen data from the current pixel position to show the newly rendered data in the display
Or highlight the current position in the window with the previous frame data
Or clear the entire display data
Done - Hitting the next step (or break) force redraws the window, basically
* When disassembling, or dumping, take into account the current device and provide APU information when needed
Done APU disassembly
Done: Disassembly address in APU mode needs to use the APU PC, not the 6502 PC
Note the "before address" fetched can be smaller than for 6502
Also the source disassembly can use the label APUData_Start as an offset if it is available. Filter by the label's zone when in APU mode?
Allow an optional offset in the debugger to support multiple chunks of APU code as source, or filter by the label's zone?
* displayV changes in simulation at displayH=0x180
* This is also at MSB displayH, of course
* Fix emulation to reflect this (it changes at 0x188 currently)
* Look for "// if (displayH == 0x180) {"
* Fix Sprite2 logic
* However, another issue is that at 0x180 the real hardware displays the last eight pixels of the previous line...
* Schematic will need a comparator for 0x188, could make it a latch in the register space and variable with a comparator...
* Could also use a configurable "use start line" value for the span draw. This would be useful for shifting the horizontal "origin" of the screen
* Done: If the write pixel logic does the same kind of transparency test as the regular sprites (read, test, then optional write / re-write value) this would allow the highest priority sprites to be drawn first and avoid dropout issues better...
* Rotated Sprites2
double tempRot = Math.toRadians(33.0f);
int rotpixelX = (int)(((double)pixelX * Math.cos(tempRot)) - ((double)pixelY * Math.sin(tempRot)));
int rotpixelY = (int)(((double)pixelX * Math.sin(tempRot)) + ((double)pixelY * Math.cos(tempRot)));
pixelX = rotpixelX;
pixelY = rotpixelY;
if (pixelX >= 0 && pixelX < 32 && pixelY >= 0 && pixelY < 32) {
// Rotation with origin correction
int pixelX = (currentSpriteXPixel >> 5) & 0x1f;
int pixelY = (currentSpriteYPixel >> 5) & 0x1f;
pixelX = (currentSpriteXPixel >> 3) & 0x7f;
pixelY = (currentSpriteYPixel >> 3) & 0x7f;
double tempRot = Math.toRadians(display.getFrameNumberForSync());
// Note, reverse the rotation for the origin
pixelX += 64.0f * Math.cos(-tempRot) - 64.0f * Math.sin(-tempRot);
pixelY += 64.0f * Math.sin(-tempRot) + 64.0f * Math.cos(-tempRot);
pixelX -= 64.0f;
pixelY -= 64.0f;
int rotpixelX = (int)(((double)pixelX * Math.cos(tempRot)) - ((double)pixelY * Math.sin(tempRot)));
int rotpixelY = (int)(((double)pixelX * Math.sin(tempRot)) + ((double)pixelY * Math.cos(tempRot)));
pixelX = rotpixelX >> 2;
pixelY = rotpixelY >> 2;
if (pixelX >= 0 && pixelX < 32 && pixelY >= 0 && pixelY < 32) {
pixelX &= 0x1f;
pixelY &= 0x1f;
// Rotation with a sqrt(2) scale fix applied, to avoid sprite bounding edges
int pixelX = (currentSpriteXPixel >> 5) & 0x1f;
int pixelY = (currentSpriteYPixel >> 5) & 0x1f;
pixelX = (currentSpriteXPixel >> 3) & 0x7f;
pixelY = (currentSpriteYPixel >> 3) & 0x7f;
double tempRot = Math.toRadians(display.getFrameNumberForSync() * (drawingSpriteIndex / 2));
if ((drawingSpriteIndex & 1) == 1) {
tempRot = Math.toRadians(-display.getFrameNumberForSync() / drawingSpriteIndex);
}
final double root2 = Math.sqrt(2.0f);
// Note, reverse the rotation for the origin
pixelX += (64.0f * Math.cos(-tempRot) - 64.0f * Math.sin(-tempRot)) / root2;
pixelY += (64.0f * Math.sin(-tempRot) + 64.0f * Math.cos(-tempRot)) / root2;
pixelX -= 64.0f;
pixelY -= 64.0f;
int rotpixelX = (int)((((double)pixelX * Math.cos(tempRot)) - ((double)pixelY * Math.sin(tempRot))) * root2);
int rotpixelY = (int)((((double)pixelX * Math.sin(tempRot)) + ((double)pixelY * Math.cos(tempRot))) * root2);
pixelX = rotpixelX >> 2;
pixelY = rotpixelY >> 2;
if (pixelX >= 0 && pixelX < 32 && pixelY >= 0 && pixelY < 32) {
pixelX &= 0x1f;
pixelY &= 0x1f;
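The fragments above are scratch experiments. As a self-contained illustration of the core step only (rotate a pixel coordinate about a chosen origin, then test it against the 32x32 sprite bounds), here is a simplified sketch. It does not reproduce the fixed point shifts or the sqrt(2) scale fix, and the names are hypothetical.

```java
// Hypothetical, simplified version of the rotate-then-bounds-test idea.
final class SpriteRotationSketch {
    /** Rotates (x, y) about (cx, cy) by the given angle in degrees. */
    static int[] rotateAbout(int x, int y, double cx, double cy, double degrees) {
        double angle = Math.toRadians(degrees);
        double dx = x - cx; // move the origin to the rotation centre
        double dy = y - cy;
        double rx = dx * Math.cos(angle) - dy * Math.sin(angle);
        double ry = dx * Math.sin(angle) + dy * Math.cos(angle);
        // Move the origin back and round to the nearest pixel
        return new int[] {(int) Math.round(rx + cx), (int) Math.round(ry + cy)};
    }

    /** True when the coordinate falls inside a 32x32 sprite. */
    static boolean insideSprite(int x, int y) {
        return x >= 0 && x < 32 && y >= 0 && y < 32;
    }

    public static void main(String[] args) {
        int[] p = rotateAbout(20, 16, 16.0, 16.0, 33.0);
        System.out.println(p[0] + "," + p[1] + " inside=" + insideSprite(p[0], p[1]));
    }
}
```

Rotating about the sprite centre rather than the top left corner is what the "origin correction" variants above are working towards.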
* @TC-9 created for 32x32 behaviour
* @TC-1: When using "enable APU mode" and "enable user port bus debug output" the debug output needs to include valid "wait for" information so that it can be fully tested with the APU hardware
This "wait for" information should be the same as "enable video display bus debug output" and the APU hardware output should roughly match what is observed from "enable video display bus debug output"
The APU hardware output can then be used with the video rendering
* This mostly works, however at the moment the data from data writes done in the feature file are not fully captured in the user port debug: VideoHardware\target\debugDataJustUserPort.txt
Data up to the first d$9e0001** with enable display bit $20 set will be captured in VideoHardware\target\debugData.txt
This is up to the first w$ or ^-$01 in that file
* So copy the whole chunk from there to the start of debugDataJustUserPort.txt, replacing everything up to the first w$ or ^-$01
Then use debugDataJustUserPort.txt in the APU simulation
* Then for the file BombJack\output\DebugAPUOutput.txt
The initial waits before d$9e0001** with enable display bit $20 will need to be changed from "w$" to ";w$"
Then DebugAPUOutput.txt should be able to be used in the video simulation to check APU generated output
* At the moment the video simulation with APU output seems to be roughly OK, but some of the waits seem to be missed?
Replacing all "w$ff03ff00" with "w$ff01ff00" to ignore the _VIDCLK seems to make most (not all) frames better
Also making the DigitalData generator use a faster clock rate helps
* This is most likely due to some waits appearing just before the last data write is completed
* @TC-11: Vector display layer
8K for colour, 8K for scan length
Two banks of this for displayed and back buffer screen
Will need a bank selection register
Low vsync resets the pixel count, pixel, address
Low display enable behaves like vsync
Each new line reads the colour and scan length pair in parallel, the pixel clock counts down for the scan length byte before reading a new pixel/length pair.
End of line -ve _hsync edge increments the read address
Or the inverted length (read into the counter) increments, and reaching zero triggers a read address increment
And new length read
And pixel colour latch
This gives a maximum theoretical scan complexity of 36 segments per scan, assuming 224 scans
The existing C64 VectorBitmap demo that calculates scans could be used for this
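A sketch of how one scan line could be expanded from colour/length pairs, and where the 36 segments figure comes from. It assumes a length byte of N means N pixels of that colour, which may not match the final hardware counter behaviour. All names are hypothetical.

```java
// Hypothetical software model of the colour/length pair expansion.
final class VectorScanSketch {
    /**
     * Expands colour/length pairs, starting at address start, into one line of pixels.
     * Assumption: a length of N holds the colour for N pixels.
     */
    static int[] decodeLine(int[] colours, int[] lengths, int start, int width) {
        int[] out = new int[width];
        int limit = Math.min(colours.length, lengths.length);
        int addr = start;
        int x = 0;
        while (x < width && addr < limit) {
            int colour = colours[addr] & 0xff; // colour and length are read in parallel
            int run = lengths[addr] & 0xff;
            addr++;                            // next pair once this run is consumed
            for (int i = 0; i < run && x < width; i++) {
                out[x++] = colour;
            }
        }
        return out;
    }

    /** Average number of pairs available per scan when the RAM is shared evenly. */
    static int maxSegmentsPerScan(int ramBytes, int scans) {
        return ramBytes / scans;
    }

    public static void main(String[] args) {
        int[] line = decodeLine(new int[] {1, 2, 3}, new int[] {2, 3, 1}, 0, 6);
        System.out.println(java.util.Arrays.toString(line));
        System.out.println(maxSegmentsPerScan(8192, 224));
    }
}
```

With 8K of pairs and 224 scans the integer division gives the 36 segments per scan quoted above.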
* Vector layer: Storing palette/length pairs needs to be in sequence $0000-$3fff or $8000-$bfff, but stored in the RAMs alternately to allow the parallel read
* @TC-11-000433.bmp
3D rendering almost works, but there seems to be some weird scan corruptions
Also seeing: com.loomcom.symon.exceptions.MemoryAccessException: uninitialised memory read: 2D75 F5 19 SBC $19,X [$00]@$0033 A:39 X:1A Y:19 F:25 S:1F3 [..-..I.C] SBC Poly2D_vertexBufferX
Poly2D_loadVertsIntoRegs: sbc Poly2D_vertexBufferX, x
x should be 0-3 at this point
** Will need to enable: I enable trace with indent
* Add automatic guard memory ranges that assert when read / write / execute operations happen
These can be read from label pairs that add buffer memory space
A macro can be used, the same label names sorted by their memory addresses represent the address pairs
Syntax to submit label/address pairs
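A minimal sketch of the guard range idea, assuming each label pair gives a start address (inclusive) and an end address (exclusive). The class is hypothetical and is not wired into the real memory bus here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard range checker for illustration only.
final class GuardRanges {
    private final List<int[]> ranges = new ArrayList<>();

    /** Registers a guarded range from a label pair: start inclusive, end exclusive. */
    void add(int startInclusive, int endExclusive) {
        if (endExclusive <= startInclusive) {
            throw new IllegalArgumentException("empty guard range");
        }
        ranges.add(new int[] {startInclusive, endExclusive});
    }

    /** True when the address falls inside any guarded range. */
    boolean isGuarded(int address) {
        for (int[] r : ranges) {
            if (address >= r[0] && address < r[1]) {
                return true;
            }
        }
        return false;
    }

    /** Called on each read / write / execute, fails when guard memory is touched. */
    void check(int address, String operation) {
        if (isGuarded(address)) {
            throw new IllegalStateException(
                    String.format("guard memory %s at $%04X", operation, address));
        }
    }

    public static void main(String[] args) {
        GuardRanges guards = new GuardRanges();
        guards.add(0x0400, 0x0410);
        guards.check(0x03ff, "read"); // passes, just below the guarded range
        System.out.println(guards.isGuarded(0x0405));
    }
}
```

A linear scan is enough for a handful of ranges. A sorted structure would only be worth it if many label pairs are submitted.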
* APU Source debugging works, however the APU instructions are all implemented as macros, so the macro source is shown. Doh. :)
ACME will need to be improved to output the debug source hierarchy in the PDB, i.e. which file includes which file
Or ACME parses APU code directly without macros
Then AddrInfo can be updated to include the source file levels after the primary level...
** Or have the option of reporting the parent in the PDB after encountering a ! pseudo op. The PO would basically say "output the level above this macro" for the lifetime of the macro
!previouscontext added to ACME
* Fix sprites2 alignment (shift 2 pixels to the right) in emulation to match simulation and reduce calculation by 2 extra clocks
* Add sprites2 clock speed selection in emulation
* For if (pixelsSinceLastDebugWrite >= pixelsSinceLastDebugWriteMax)
This test is actually bugged as it outputs something like:
w$ff03ff00,$72004800
d$980f01f8
d$9e0a010f
This does not reflect accurately the position to actually wait for when writing...
Need to debug "APU_ProcessSpriteStrip" and double check the enable is done 16 pixels after the sprite register updates...
Then the simulation can be fixed if needed. But data indicates that actually it should be OK.
* Layer combiner for pixel output: com.bdd6502.DisplayBombJack.addLayer
Add a pixel combiner for two inputs to one combined output. This will be useful for the SF2 demo which uses two sprite planes
Copy from @TC-8
Hardware wise the layer enable signal from the pixel header would need to be split to the two input pixel headers
* InitWindow uses the correct window size to reflect the screen display aspect ratio
* Sprites2: Add debugger output to show the used scan time for each line of calculated sprites2
* Best place to add that is: debuggerUpdateRegs
^^ Done
* Removed voicesLoopMask since the Audio 9.3 hardware assumes loops are always enabled now
* Updated non-looped samples to always use the first exported sample 0x80 byte
* Syntax to randomly initialise data in all connected devices and memory
Given randomly initialise all memory using seed 4321
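A minimal sketch of why the seed matters: the same seed must always give the same "random" memory so a failing scenario can be replayed. This uses Python's generator, so the byte values will not match whatever the emulator produces for seed 4321; the function name is hypothetical:

```python
import random


def randomise_memory(size: int, seed: int) -> bytearray:
    """Fill `size` bytes with repeatable pseudo-random values derived from `seed`."""
    rng = random.Random(seed)  # private generator, so other random users are not disturbed
    return bytearray(rng.randrange(256) for _ in range(size))
```

Calling randomise_memory(0x10000, 4321) twice returns identical contents.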
* Added Audio layer debug in the register view, if it's enabled
* APU disassembly seems to ignore locations 0-$10 during the SotB demo?
(Previous fix)
* Break on address
Done - CPU: receivedBreakAt
Done - Add "delete"/"del" command to remove address breakpoints
* BDD6502 needs to output compatible break response...
Also "break" might need to show a list with index values like Vice...
"delete" also
break 300
BREAK: 2 C:$0300 (Stop on exec)
* Add cartridge support
Read CRT files and store their banks and addresses into a map for easy lookup
Add syntax to optionally configure cartridge bank address, whether it is read enabled
Add syntax to specify combinatorial logic (use the variable resolution code) on multiple locations and enable/disable cart banks, kernal/BASIC ROMs etc
This can also support bank sizes by mapping cart banks with addresses
Obviously more than one rule and multiple ROM binary or CRT files can be specified allowing complex cart bank/ROM arrangements to be created
EF3 and GMod2 need support
com.loomcom.symon.Bus.write processorPort theProcessorPort
Detect theProcessorPort changes
features/C64ROMs.feature
Given a CRT from file "some\file"
Will implicitly enable processor port. Note: IO Write detected:
Done: theProcessorPort needs $17 on reset to emulate the C64
Note: C:\work\c64\stdlib\stdlib.a lines 19-35
Done: However the precise CRT support will need early hooks into Bus read and write
But must only access the cart when ProcessorPortDefault or ProcessorPortCharROMBASICKERNAL
Or for EF3 EASYFLASH_CONTROL_ULTIMAX mode (the default) where the kernal is mapped from the cart
EASYFLASH_CONTROL_8K is used very early in most of my startup code
>> Just assume EASYFLASH_CONTROL_8K?
Then the scenario can start the CPU at the intended vector address
Note these will be needed in the scenario, but don't need code updates:
Given a ROM from file "..\..\VICE\C64\kernal" at $e000
Given a ROM from file "..\..\VICE\C64\basic" at $a000
Given add C64 hardware
Adds devices to cover $d000 - $dfff
Done: Will need expansion to match the processor port config
>> com.loomcom.symon.Bus.buildDeviceAddressArray can have an array of deviceAddressArrayWrite and deviceAddressArrayRead indexed by the lower three bits of the processor port
Bus read and write will need different behaviour as writes go to the underlying RAM
file:///C:/work/C64Docs/unusedino.de/ec64/technical/aay/c64/memcfg.htm
And will automatically map in devices with $a000 $d000 $d800 $e000 addresses
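A minimal sketch of the read/write tables indexed by the lower three bits of the processor port, for the no-cartridge case only. The table follows the standard C64 memory configuration (LORAM, HIRAM, CHAREN); the function names are hypothetical and this is not the com.loomcom.symon.Bus code:

```python
RAM, BASIC, KERNAL, CHARROM, IO = "RAM", "BASIC", "KERNAL", "CHARROM", "IO"

# Index = processor port & 7 (bit 0 LORAM, bit 1 HIRAM, bit 2 CHAREN), no cartridge attached.
# Each entry is what a read sees at ($a000-$bfff, $d000-$dfff, $e000-$ffff).
READ_MAP = [
    (RAM, RAM, RAM),           # %000
    (RAM, CHARROM, RAM),       # %001
    (RAM, CHARROM, KERNAL),    # %010
    (BASIC, CHARROM, KERNAL),  # %011
    (RAM, RAM, RAM),           # %100
    (RAM, IO, RAM),            # %101
    (RAM, IO, KERNAL),         # %110
    (BASIC, IO, KERNAL),       # %111, the low bits of $17 used on reset
]


def read_target(port: int, address: int) -> str:
    """Return which device answers a read at `address` for this processor port value."""
    if 0xA000 <= address <= 0xBFFF:
        return READ_MAP[port & 7][0]
    if 0xD000 <= address <= 0xDFFF:
        return READ_MAP[port & 7][1]
    if 0xE000 <= address <= 0xFFFF:
        return READ_MAP[port & 7][2]
    return RAM


def write_target(port: int, address: int) -> str:
    """Writes go to the underlying RAM unless the IO area is mapped in."""
    return IO if read_target(port, address) == IO else RAM
```

With the reset value $17, read_target gives KERNAL at $e000 while write_target gives RAM, which is the different read/write behaviour noted above.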
* Need to test bank switching behaviour. Create a function that stores known data when it's called from ROM.
* To aid debugging hardware, displaying the pixel palette index and RGB colour under the mouse cursor hovering over the screen would be very useful
Also which layer it came from
Can be optional syntax since it will probably have a performance impact
Set the title with details?
com.bdd6502.DisplayBombJack.calculatePixel will need to store extra details
Just before some calls to com.bdd6502.QuickDrawPanel.fastSetRGB
com.bdd6502.DisplayBombJack.RepaintWindow to do the update
* APU: Hardware test seems to show the APU will run its code and even the multiplex sprites and raster bars will work
But not if the CPU sending data to the APU (or externally?) interrupts APU processing
The more data sent to the APU the higher the chance the APU will hang
This indicates the bus arbitration is not functioning properly
>> Not entirely a surprise since the memory timing could have short pulses etc. These short pulses happen because the internal bus cut-off (going into RAM U17 APU data) and switch over can happen at any point
** This needs more testing with some simpler example code. Perhaps some raster bars and the CPU not sending any new memory updates either to the APU or the external bus...
* If no workaround can be found, perhaps an alternative design for the bus arbitration
>> Workaround was found (see this commit APUAvoidRasters and APUAvoidRastersTable) however it is slow, so not suitable
For the internal bus writes (from the CPU) rely on the fact that they are relatively slow and store the last 24 bit address and data into temporary latches
>> Done: Instruction and data selection (writes) no longer reset (retry) the APU instruction
The temporary latch writes can use the _IMEWR (IEMEMWRITE)
The register storage for _RESET and ENABLEAPU must still use original IEBS IEA IED and _IMEWR
The internal RAMs the BUSCHOICE must now use the external EBBS EA ED and _MEWR
This would theoretically mean the APU self-writes its internal memory using external writes, which is OK, but not ideal. Still use the internal write flag.
Using a new 1-of-4 to output to EBS EA ED and _MEWR _MERE (as before):
** When the APU is RESET/disabled then existing write timings from the userport interface are used
** Only when the APU is active are new write timings used
If _RAMSWITCHTOPASSTHROUGH = 0 then use the original IEBS IEA IED IEMEMWRITE IEMEMREAD
If _RAMSWITCHTOPASSTHROUGH = 1 and InterceptBus = 1 then use RINT IDATA _ExternalMEWRPulse
If _RAMSWITCHTOPASSTHROUGH = 1 and InterceptBus = 0 then use latched values logic:
USECACHE and the eventual new generated write pulse are created from a counter and demuxers (like the APU instruction)
If the latched EBBS (TLIEBS) is not 0 (indicating something is needed)
At the start of the APU instruction then:
Pause the instruction (not retry it) when trying to increment the PC and the current instruction is not trying to do any internal or external writes (which prioritises APU writes):
PCINCR = 0
and not InternalMEWR
and not InterceptBus
Execute the latched write by using the new 1-of-4 paths: Create a USECACHE and new write pulse (per above note for USECACHE)
>> Update NoLatchedWrite on +ve PCINCR
>> Needs _LATCHEDWRITEPulse
Clear the latched data (end of the generated write cycle) using a new input into _FLUSHCACHE, which allows the APU to continue instructions
This is a fully delayed and cooperative internal/external bus write model instead
>> This would actually remove the need for any instruction retries.
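A minimal sketch of the selection and pause rules listed above, written as plain boolean logic so the cases can be checked outside the schematic. Signal names come from the notes; the function names and the returned strings are hypothetical, and the real design is a hardware 1-of-4 selector, not code:

```python
def select_bus_source(ram_switch_to_passthrough: int, intercept_bus: int) -> str:
    """Pick which signals drive EBS EA ED and _MEWR _MERE."""
    if ram_switch_to_passthrough == 0:
        # APU reset/disabled: existing userport write timings
        return "original: IEBS IEA IED IEMEMWRITE IEMEMREAD"
    if intercept_bus == 1:
        # The APU instruction itself owns the external bus
        return "apu: RINT IDATA _ExternalMEWRPulse"
    # APU active but not using the bus: the latched CPU write may go out
    return "latched: latched values logic"


def may_execute_latched_write(tliebs: int, pcincr: int, internal_mewr: bool, intercept_bus: bool) -> bool:
    """True when the APU should pause and let the latched write happen.

    APU writes have priority, so the latched write waits while the current
    instruction is doing any internal or external write.
    """
    return tliebs != 0 and pcincr == 0 and not internal_mewr and not intercept_bus
```

The fourth input of the 1-of-4 is not described in the notes, so it is left out here.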
* Debugging notes
During reset and disabled period, instruction and data writes work
** Latched writes "b$11,b$12,b$13,b$14,b$15,b$16,b$17" are unreliable
When writing $4102 $13
Oh bother, it's because the APU instruction is waiting for a HV position
Still unreliable, but $4100 $11 didn't write
Break on LatchedWriteDesired
Debug data clock reverted back to 1M
>> There needs to be some logic that reads LatchedWriteProceed if it's waiting
InstrIsWaiting = 1 when instruction is waiting
$4100 $11 didn't write because its write happens while the previous cached write is being written
PCINCR = 0.8 MHz so data clock back to 500KHz
Done: Remember to enable tree movement and remove the APUAvoidRastersTable code...
; For MAPUAvoidRastersTest: Define out the next line to simply remove the avoid rasters code
; For MAPUAvoidRastersTest: Comment out the next line to simply remove the table creation
; For MAPUAvoidRastersTest: Comment out the next line to simply remove the test
* Done: Check the _MEWR timings and address setup timings are balanced, for (all in ns):
                                  Same write period   Bus   Write time   Pre   Post
A5 _RAMSWITCHTOPASSTHROUGH = 0    2000 (500KHz)       395   120          155   120
A3 InterceptBus = 1               1650                580   182          202   195
A4 LatchedWriteActive = 1         2230                512   160          225   132   (not centred)
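A minimal sketch for checking the balance of the figures above. It assumes the five numbers per row are, in order, same write period, bus, write time, pre and post, and that the bus time should be roughly pre + write + post. That reading is inferred from the numbers, not stated in the notes:

```python
# (same write period, bus, write time, pre, post), all in ns, copied from the table above
TIMINGS = {
    "A5": (2000, 395, 120, 155, 120),
    "A3": (1650, 580, 182, 202, 195),
    "A4": (2230, 512, 160, 225, 132),
}


def timing_summary(_period: int, bus: int, write: int, pre: int, post: int) -> dict:
    """Report how far the write pulse is from being centred in the bus window."""
    return {
        "unaccounted": bus - (pre + write + post),  # measurement slack, ideally near 0
        "imbalance": pre - post,                    # 0 would be a perfectly centred pulse
    }
```

For A4 this gives an imbalance of 93 ns against 35 ns for A5 and 7 ns for A3, which agrees with the "not centred" remark.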
Done:
* Need to reconcile data and images for @Demo6 output with emulation and simulated APU
See: "* For APU validation when running @Demo6:"
* Update emulation: When InstrIsWaiting check (debug break/error) that there are no InternalMEWR InterceptBus operations in the previous or current instruction
Also any userport writes must now be delayed, to emulate the latched memory write behaviour including delays for any InternalMEWR/InterceptBus and no delays while HV wait
Also there are a couple of extra cycles if the APU has to pause while a latched write happens
* Final feature file validation
* Replace 6116SA20TPGI with https://www.mouser.sg/ProductDetail/Renesas-Electronics/6116LA20TPG?qs=SmUuHNCnblrwZH5u1F5zAw%3D%3D
* Layout group and packages
* Check for merged devices
* The emulation data output debugData.txt works. There are no reconciliation errors with simulated APU.
But using the APU simulated output DebugAPUOutput.txt generates some problematic video issues and some memory writes seem to be wrong.
This looks like a timing issue to do with the simulated APU output using more frequent waits?
>> Part of the problem was the next HV wait being too soon, so the data gen was missing the event.
This was most obvious by the screen not rendering raster bar effects for part of the display period
This was fixed in the data generator where if the next wait event happens during a write it will be processed
Some update issues remain with the APU simulated data, but not from the emulated output. Most likely also timing based.
>> Check the Sprites2 writes, since their first frame is not correct
Recon is fine...
Try disabling the waits for Sprites2 memory and visually check
WTF!! C:\work\c64\VideoHardware\target\debugData - Copy.txt has:
d$9200010b
d$920101b4
d$92020114
d$92030158
d$92040120
d$92050133
This is expected. But "C:\work\BombJack\output\DebugAPUOutput - Copy.txt" has:
d$9200010b
;@time:0.032650
;delta:0.000002
;w$ff03ff00,$f2005d00
d$920101b4
;@time:0.032652
;delta:0.000002
;w$ff03ff00,$f2006900
d$92020114
;@time:0.032656
;delta:0.000004
;w$ff03ff00,$f2028200
d$92040120
;@time:0.032658
;delta:0.000002
;w$ff03ff00,$f2028e00
d$92050133
The write d$92030158 is completely missing?!!!
Even worse the recon isn't showing any differences!!!
python C:\Work\BombJack\ReconcileData\ReconcileData.py "C:\work\c64\VideoHardware\target\debugData - Copy.txt" "C:\work\BombJack\output\DebugAPUOutput - Copy.txt"
WTFx2 BombJack\ReconcileData\ReconcileData.py has a bug where it wasn't filtering out audio writes properly
oh FFS
The missing write d$92030158 is present in "C:\work\c64\VideoHardware\target\debugDataJustUserPort.txt"
So the new APU must be missing it...
Need a breakpoint on attempting to write that value...
Found the issue, the positive edge on USECLK meant that sometimes USECACHE was very short and basically skipped
This meant the value was not cached and caused missing writes
Capture for output\DebugAPUOutput.txt has "ignore zero writes" enabled, which made it harder to spot
Adding an extra USECLK input for the latch set worked. This gives adequate time from the negative edge to the next positive edge to signal USECACHE
>> Data recon and image recon passed
>> Debug image 7 is wrong when using emulation output, but correct when using simulated APU output.
APU simulated output image animation looks OK.
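A minimal sketch of an order-aware check that would have flagged the missing d$92030158 write noted above. It only looks at lines starting with "d$" and skips the ";" comment lines; it does not reproduce ReconcileData.py or its audio write filtering, and the function names are hypothetical:

```python
import difflib
from typing import Iterable, List


def data_writes(lines: Iterable[str]) -> List[str]:
    """Keep only the d$... data write lines, dropping comments, waits and blanks."""
    return [line.strip() for line in lines if line.strip().startswith("d$")]


def missing_writes(expected_lines: Iterable[str], actual_lines: Iterable[str]) -> List[str]:
    """Return expected data writes that do not appear, in order, in the actual output."""
    expected = data_writes(expected_lines)
    actual = data_writes(actual_lines)
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    missing: List[str] = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.extend(expected[i1:i2])
    return missing
```

Feeding it the two excerpts shown above returns ["d$92030158"].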
TODO:
Test place and route
Needed a bigger layout
8 layers, more compact: V9.4\APU - Routed - Tightly 2.pdsprj
Imported: Eight layers.LTF
Needed some planes created, layer pairs, and design rules signals to layers.
* When there is a screen refresh any picked pixel information is refreshed
* Enable/disable pixel picking with 'P' on the keyboard
* Hmm I wonder if I can profile 6502 code by counting cycles for every jsr until its corresponding rts...
Some monitor commands to:
profile start
profile stop
profile print
profile reset
* I will want to note the interrupt status when doing the jsr and only count cycles while it maintains the same status.
* And handle rti and obviously the kernal vectors
* This method provides a live view, partially through a subroutine. It is expensive:
* Whenever a jsr is encountered add it to a map.
* Each cycle accumulates the count in all listed jsr, with the correct status flag.
* When it has a corresponding rts, remove the map entry while noting the accumulated cycle count. Add the total for that jsr to the jsr's address map.
* This method has less performance impact:
* Whenever a jsr is encountered add it to a map with the current cycle count.
* When it has a corresponding rts, remove the map entry while noting the cycle count delta. Add the total for that jsr to the jsr's address map.
* When printing:
Print the address, label, total cycles
* Add syntax to start/stop/print/test profiled code
Given profile start
Given profile clear
Given profile stop
Given profile print
Then property "test.BDD6502.lastProfile" must contain string " : Video_WaitVBlank : "
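A minimal sketch of the lower impact method: remember the cycle count at each jsr and add the delta at the matching rts. It uses a stack rather than a map to match jsr with rts, ignores the interrupt status and rti handling mentioned above, and the class name and report format are hypothetical:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class JsrProfiler:
    def __init__(self) -> None:
        self.stack: List[Tuple[int, int]] = []          # (jsr target, cycle count at the jsr)
        self.totals: Dict[int, int] = defaultdict(int)  # jsr target -> accumulated cycles

    def on_jsr(self, target: int, cycles: int) -> None:
        self.stack.append((target, cycles))

    def on_rts(self, cycles: int) -> None:
        if not self.stack:
            return  # rts without a tracked jsr, e.g. profiling started mid-subroutine
        target, start = self.stack.pop()
        self.totals[target] += cycles - start  # inclusive of nested subroutines

    def report(self, labels: Dict[int, str]) -> List[str]:
        """One line per subroutine: address, label, total cycles."""
        return [
            f"${address:04x} : {labels.get(address, '?')} : {self.totals[address]}"
            for address in sorted(self.totals)
        ]
```

Only jsr and rts do any work here, so the per-cycle cost of the live view method is avoided.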
* Add APU write wait indicator in debug
<<Wait RAM>> will be seen in the APU debug output if the external (user port or APU) memory write causes data to pass through or to the APU
* Emulation to hardware difference
Using the below in Demo6, the left and right border are the same
jsr Video_SetAddressVideoOverscanExtentRegisters
lda #kBus24Bit_VideoLayer_OverscanExtent_Default
sta CIA2PortBRS232
>> But using kBus24Bit_VideoLayer_OverscanExtent_Wide the left border on hardware seems to be missing one pixel, this might be the TV
kBus24Bit_VideoLayer_OverscanExtent_Wide = $39
Using $49 the hardware is the same as the emulation, so ignore this, it is probably the TV or the conversion box...
The video conversion box test card actually outputs a slightly wider signal than the large TV displays. The smaller display shows this
Using the smaller display kBus24Bit_VideoLayer_OverscanExtent_Wide is the same as the emulation, so the large TV is trimming the display
Using $29, which is too far left, does not show any extra picture on the large TV
The smaller display actually shows an extra 4 pixels compared to the TV, but still not exactly the same as the emulation
Using $2a the smaller display actually shows ~4 pixels more to the left and right edge
Created kBus24Bit_VideoLayer_OverscanExtent_UnsafeWide = $2a for this
This matches precisely the video conversion box output border extent
* Make sure the default window size has rendered pixels that are square, so the regular dithering is consistent
This can be forced over the whole screen area by adding a suitable output pixel xor value to the merge control register
jsr Video_SetAddressMergeLayer
+MBus24Bit_Send8BitValue kBus24Bit_MergeLayer_Register_Control_Dither
+MBus24Bit_Send8BitValue $0f
Added a better scale (2x) in com.bdd6502.DisplayBombJack.InitWindow()
* The rendered video hardware output needs to be moved up 8 pixels to match with the simulation output and also the APU raster values in @Demo6 sky colour changes (for the above line)
* Added "--compressData <input filename> <output filename>" as a general compression option.
* Dither in emulation should match the hardware...
It does now
* Exact address matching emulation
And the display uses exact address matching
And the audio expansion uses exact address matching
And the APU uses exact address matching
And the layer uses exact address matching
* When trying to debug failing code, it would be useful to have syntax that reads expected memory, to be compared against stores into memory.
Then break on the first instance of a store not matching the expected memory.
This would allow the CPU history to be inspected etc.
New syntax:
Then for memory from "$500" to "$510" expect a write to "$500" with value "$a2"
Given load binary file "test.prg" into temporary memory
And trim "2" bytes from the start of temporary memory
Then for memory from "$500" to "$510" expect writes at "$500" with temporary memory
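A minimal sketch of the comparison behind this syntax: every store inside the watched range is checked against the expected bytes, and the first mismatch raises so the CPU history can be inspected. It assumes the range end is inclusive and that stores beyond the expected data are ignored; the class and exception names are hypothetical:

```python
class WriteMismatch(Exception):
    """Raised on the first store that does not match the expected memory."""


class ExpectedWriteWatch:
    def __init__(self, start: int, end: int, base: int, expected: bytes) -> None:
        self.start, self.end = start, end  # watched range, inclusive (assumption)
        self.base = base                   # address that expected[0] corresponds to
        self.expected = bytes(expected)

    def check(self, address: int, value: int) -> None:
        if not (self.start <= address <= self.end):
            return  # store outside the watched range
        offset = address - self.base
        if not (0 <= offset < len(self.expected)):
            return  # no expected byte for this address
        if self.expected[offset] != value:
            raise WriteMismatch(
                f"store ${value:02x} to ${address:04x}, expected ${self.expected[offset]:02x}"
            )
```

Trimming "2" bytes from a loaded .prg corresponds to passing prg_bytes[2:] as the expected data.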
* Save C64 memory range with optional 2 byte header
When save 6502 memory with two byte header from "$410" to "$418" to file "target/mem2.bin"
When save 6502 memory without two byte header from "$410" to "$418" to file "target/mem2.bin"
Then assert that file "target/mem1.bin" is binary equal to file "target/mem2.bin"
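A minimal sketch of the two save variants, assuming the two byte header is the start address in low/high byte order (as in a .prg file) and that the "to" address is included in the saved range. Both points are assumptions and the function name is hypothetical:

```python
def memory_range_bytes(memory: bytes, start: int, end: int, with_header: bool) -> bytes:
    """Return the bytes that would be written to the file for start..end inclusive."""
    data = bytes(memory[start:end + 1])
    if with_header:
        # Low byte first, then high byte, like a C64 .prg load address
        return bytes([start & 0xFF, (start >> 8) & 0xFF]) + data
    return data
```

Saving "$410" to "$418" with the header would then start with the bytes $10 $04.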
* C64 keyboard buffer hook: Given add C64 display window to C64 keyboard buffer hook
* Add syntax to handle regularly triggered IRQ: Given add C64 regular IRQ trigger of "100000" cycles
* Like the real audio hardware, allow some mixing between left/right channels. They're not completely separated.
Make it variable in syntax.
Given audio mix 85
This adds a left/right mix of 33% (0.33333 * 256 = 85)
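A minimal sketch of what the mix value could mean, assuming each output channel receives its own input plus mix/256 of the opposite channel. The actual formula in the emulation is not given in these notes, so treat this purely as an illustration of the 85 / 256 scaling; the function name is hypothetical:

```python
from typing import Tuple


def mix_channels(left: float, right: float, mix: int) -> Tuple[float, float]:
    """Bleed mix/256 of each channel into the other (assumed formula)."""
    bleed = mix / 256.0
    return (left + right * bleed, right + left * bleed)
```

With mix 0 the channels stay fully separated; with mix 85 roughly a third of each channel is heard on the other side.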
* Fix DisplayC64 null references if there is no CHARGEN ROM attached
* C64 and Video display should open side by side
* Need to support Sprites V9.5 layer, which includes the option to clock the pixels at a different clock rate.
@TC-9-95
Given add a Sprites V9.5 layer with registers at '0x9800' and addressEx '0x10' and running at 16MHz
* CPU cycle count to variable: machine.getCpu().getClockCycles()
* Improved the palette emulation by maintaining real palette memory and the palette cached RGB value separately.
* Add: Given set the video display to RGB colour 5 6 5
Default is: Given set the video display to RGB colour 4 4 4
* Add: Given set the video display with 8 palette banks
Default is: Given set the video display with 0 palette banks
Which disables the palette banks feature entirely.