Sunday, April 11, 2010
Update on the Async bug
Well, I've been really busy with life lately, but I recently had some time to get back to the code. I've managed to make some progress on the Async bug, and what seems to be getting skipped are the Audio Interface Interrupts (AII). I can, in a hacky way, force these not to be skipped in Async mode or in Disabled mode. For Async, this means the Async bug never activates and we don't lose sound (I tested this on the intros of SM64 and SSB). For Disabled, this means games like Turok can boot completely without sound. While working with Turok, I also discovered that the audio interrupt has to occur at the beginning of the frame rather than at the end of it. I believe this is the proper way to do it, and it should reduce some of the popping as well. I'll be experimenting further with this tonight so that Sync and Async both use the same method. Upon a scene change (like when entering a level in SM64), the missing audio interrupt with the bug active is what causes the freeze, because the registers aren't updated properly. That creates a busy-wait situation when nothing is actually busy, and the game gets stuck in a loop. I believe that all of this should yield multiple benefits: better sound (with fewer delays between sound processing cycles), faster speed (by only processing the AIIs we have to), and better compatibility (with more modes working with more titles and without the sound dropping).
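To make the busy-wait easier to picture, here is a minimal C++ sketch. It is not DX64 code: the `AudioInterface` struct, the function names and the poll limit are all made up for illustration, and the "busy" flag is a simplified stand-in for the real audio registers. It only shows the idea that the flag is cleared by servicing the interrupt, so a frame where the interrupt is skipped leaves the game polling forever.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the audio interface state.
// These names are illustrative and are not taken from the DX64 source.
struct AudioInterface {
    bool busy = false;           // set when a buffer is queued, cleared by the interrupt
    std::uint32_t serviced = 0;  // number of interrupts actually handled
};

// Queue a buffer: the interface reports busy until the interrupt is serviced.
inline void QueueBuffer(AudioInterface& ai) { ai.busy = true; }

// Servicing the audio interrupt is the only thing that clears the busy flag.
inline void ServiceAudioInterrupt(AudioInterface& ai) {
    if (ai.busy) {
        ai.busy = false;
        ++ai.serviced;
    }
}

// Emulated game code polling the busy flag. Returns the number of polls spent
// waiting, or -1 if the flag is still set after max_polls (the "lock-up").
inline int WaitUntilIdle(const AudioInterface& ai, int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        if (!ai.busy) return i;
    }
    return -1;
}

// One emulated frame: the interrupt (if raised) is handled at the START of the
// frame, then the game waits for the interface and queues its next buffer.
inline int RunFrame(AudioInterface& ai, bool raise_interrupt) {
    if (raise_interrupt) ServiceAudioInterrupt(ai);
    const int polls = WaitUntilIdle(ai, 1000);
    if (polls >= 0) QueueBuffer(ai);
    return polls;
}
```

Calling `RunFrame(ai, true)` on every frame keeps the wait at zero polls, while a single `RunFrame(ai, false)` after a buffer has been queued returns -1, which corresponds to the stuck loop on a scene change.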
Friday, February 19, 2010
Still busy with life, but...
Life is still keeping me busy, but I have had time to do some testing related to the Async bug. From what I can tell it's a semaphore issue causing a race condition, which is why the bug is intermittent. Generally speaking, because DX64 runs at less than 100% speed we just get choppy sound on Sync, but every now and again we do get bursts of over 100%. When using the ME for sound, the bursts of speed happen more often (even more often with frame skip). Whenever the ME is busy and the rest of the emu completely finishes its job and then tries to give the ME more work to do, the work usually gets added to the back of the line and the emu continues about its business. Sometimes, however, the line is full and there is no more room, and the proper signal that the ME can receive more data isn't being sent back to the CPU emulation. This causes the CPU not only to stop sending audio data, but also to completely skip other audio-related checks as well (DAC rate changed, etc.). Assuming this is correct, I'm working to force the situation; then I'll know for sure. The bypassing of the audio interrupt checks, and of the processing done during those interrupts, is where we get our speed boost from the bug.
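The "line is full" case can be sketched as a small fixed-size queue. This is only an illustration of the theory above, not the actual DX64 or ME code: `JobQueue`, `TryPush` and `TryPop` are hypothetical names, jobs are plain integers, and the sketch is single-threaded, so it ignores the semaphores and cache handling that real work shared between the main CPU and the ME would need. The point is that a full queue must report back to the caller so the job can be retried instead of silently dropped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size job queue between the main CPU (producer) and the
// ME (consumer). Single-threaded sketch: no locking or cache handling.
template <std::size_t N>
class JobQueue {
public:
    // Returns false when the queue is full, so the caller knows the job was
    // NOT accepted and has to be retried later.
    bool TryPush(int job) {
        if (count_ == N) return false;
        jobs_[(head_ + count_) % N] = job;
        ++count_;
        return true;
    }

    // Returns false when there is nothing to take.
    bool TryPop(int& job) {
        if (count_ == 0) return false;
        job = jobs_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t Size() const { return count_; }

private:
    std::array<int, N> jobs_{};
    std::size_t head_ = 0;   // index of the oldest job
    std::size_t count_ = 0;  // number of jobs currently queued
};
```

With a `JobQueue<2>`, a third `TryPush` returns false until a `TryPop` frees a slot. If the producer never learns that a slot has been freed, it stops submitting work, which is the behaviour suspected here.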
Now the big question: can we use this to our advantage? Yes, no, and sometimes. If sound is off and we process the bare minimum of checks to prevent lock-ups, then yes. If sound is on and we have to process all the checks, then no, not without lock-ups. However, if sound is on and we don't need all the checks for the audio, or don't need to check as often, then maybe. Those are all big ifs, but we may see an advantage without sound. I'm currently working on narrowing down, first, what is bypassed in the CPU (I already know part of it, but it doesn't explain the amount of speed we are getting or the lock-ups), and then seeing what, if any of it, we can do without or modify for our own purposes through code. (It wouldn't be true emulation, but HLE isn't true emulation to begin with.)
P.S. If this can provide a boost, then it would still work even after the CPU move. The only questions are: can it, and can it with audio?
Wednesday, February 17, 2010
Lack of recent progress and current news
People have been asking for an update, so I thought I'd let everyone know what's going on. The past month and a half has been very busy for me. I'm now in college and working full time. I work 7 days a week, 12 hours a day, for up to two months straight, and then I get a week or two off. I was unemployed most of last year, so now I have a lot less time on my hands for code. My job does afford me some downtime, most of which goes toward school. I'm currently pursuing an Associate of Applied Science in Computer Programming. I'm also looking at getting several IT certifications along the way to help keep my options open when I can finally make a career change. I do still have a decent amount of leftover time, most of which has been spent working on the forums and dealing with staff issues. The main changes I had planned there are done. (I got a staff hierarchy set up in order to reduce issues there, and integrated the compatibility list with the site.)
On the code side of things, I've done some minor code clean-ups and I'm currently working on a few regressions. Along the way I've learned to better debug crashes, which is a good thing. I also made my first attempt at moving code to the ME. I tried to move the "JPEG_Task" to the ME, but this didn't go so well. So far it gets about halfway through the code for it and crashes. Crashes on the ME are hard to debug, but this will be good practice for the CPU move. The JPEG_Task isn't actually planned to be moved; I'm only doing this because it's much simpler than the CPU_Task and a good case for learning. I've also started a full port of 1964Video (Rice 6.3.0r35) from the 1964 SVN, which should kill 90% of the graphical errors and a few crashes from within the graphics plug-in. It will also increase the overall compatibility of the emulator. This will all take a while. As I stated, I'm working on the regressions I introduced in Rev 432; this should only take a few days. Then I'll be back to debugging my JPEG-related code and porting Rice.
On a side note, both school and work are going well. Salvy has been adding a few bits of code related to uCodes and the blender lately, and one of our forum members (Yamagushi) has implemented a new GUI on Rev 463. I still have further plans for the GUI, but his changes are a good starting point for them.
Labels: Beta 3, Clean up, CPU, Crashes, DaedalusX64, Downtime, Dynarec, GUI, Life, Media Engine, Regressions, Salvy, uCodes, Yamagushi
Wednesday, December 30, 2009
Happy Holidays and Current Progress
Here's a late Happy Holidays and a Happy New Year to all who read this.
This will be my final update of the year. Like I said in my last post, I started from scratch, but I quickly ran into some problems. Everything is so interconnected and interwoven that the route I was taking just doesn't make sense any more. So I'm back to my prior method of taking things apart. Once I've got sound and graphics completely separated out, I'll work on separating out the RSP, and the Dynarec/CPU last of all. Then I can take the code I cobbled together and start bringing the other parts back in one at a time.
I've been rather busy this month with the Holidays and my schedule overall won't be letting up anytime soon, but I will be working on the code in what little spare time I get. My next post will likely be around the end of next month. Hopefully, I'll have more news by then.
Thursday, December 3, 2009
Rev 444 and the beginnings of Beta 3
Rev 444 was just regression fixes and a bit of code clean-up from some of my prior commits. Now I'm on to the major recode for Beta 3. I've changed the way I'm going about it since my prior post here. Instead of working in reverse, removing parts of DX64 and working backwards, I've decided to start from scratch, import code as needed, and work forward. This way I can ensure all the code gets looked at, and I would basically have been doing this anyway after I finished with the code disassembly.
I've talked with Strmnnrmn about how I plan to do the ME move to get a normally serialized process working asynchronously. Right now in DX64 it goes:
1. Dynarec
2. CPU Execution
3. Graphics
4. Audio (Although in Async this is basically bypassed on the SC.)
That is basically serialized access to the individual components. To make this a truly multi-processing app, when the move is done it will go:
First Two Frames:
SC                         | ME
1. Dynarec (Frame 1)       |
3. Dynarec (Frame 2)       | 2. CPU Execution (Frame 1)
4.2 Graphics (Frame 1)     | 4.1 CPU Execution (Frame 2)
4.3 Audio (Frame 1)        |
Subsequently: (Frame X = Current Frame, X-1 = Last Frame)
SC                         | ME
1. Dynarec (Frame X)       | 0. (Finishing execution of X-1)
2.2 Graphics (Frame X-1)   | 2.1 CPU Execution (Frame X)
2.3 Audio (Frame X-1)      |
This will mean that at any given time DX64 will be working on two frames at once. I want the Dynarec, CPU Execution and Audio to each be movable between the SC and the ME through build flags, as needed for debugging purposes. Graphics will always be on the SC, as the ME has no GE access. (The only exception there might be some math-heavy tasks like clipping, but without a VFPU I think this should stay where it is.)
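As a rough way to read the "Subsequently" table, the steady-state work for one frame can be written down as data. This is only a sketch of the plan, not DX64 code: `Stage`, `Task`, `Slot`, `SteadyStateSlot` and `CountDistinctFrames` are hypothetical names, and nothing here actually runs on the ME or handles the overlap with the tail end of frame X-1.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical names for the four stages listed in the post.
enum class Stage { Dynarec, CpuExecution, Graphics, Audio };

struct Task {
    Stage stage;
    int frame;
};

// Work assigned to each processor while frame x is the current frame.
struct Slot {
    std::vector<Task> sc;  // main CPU (SC)
    std::vector<Task> me;  // Media Engine (ME)
};

// Steady state for frame x (x >= 2): the SC recompiles frame x, then handles
// graphics and audio for frame x-1 while the ME executes frame x.
inline Slot SteadyStateSlot(int x) {
    Slot s;
    s.sc.push_back(Task{Stage::Dynarec, x});
    s.sc.push_back(Task{Stage::Graphics, x - 1});
    s.sc.push_back(Task{Stage::Audio, x - 1});
    s.me.push_back(Task{Stage::CpuExecution, x});
    return s;
}

// Number of different frames being worked on within one slot.
inline std::size_t CountDistinctFrames(const Slot& s) {
    std::set<int> frames;
    for (const Task& t : s.sc) frames.insert(t.frame);
    for (const Task& t : s.me) frames.insert(t.frame);
    return frames.size();
}
```

For any `SteadyStateSlot(x)`, `CountDistinctFrames` returns 2, which is the "two frames at once" property described above.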
What all this means is I will basically be recoding everything from scratch and looking at all of the code for optimization and memory usage as I go. This includes uCodes, graphics, audio, the combiner, RSP emulation, etc. I literally mean every line of code. I got the go-ahead from Strmnnrmn to do this, but I'm not going to be committing the changes as I go, so as not to break the SVN. I will also be looking at the new GUI and preparing for it, as well as all the other plans I had listed for Beta 3. Next time I post, I will update my progress on all of this and list any issues I'm running into.
Wednesday, November 11, 2009
Rev 443 and the next few revs
For Rev 443, I tracked down the issue plaguing Paper Mario. Since it was a regression, I went about it the way I normally find regressions: I found the last revision that didn't have the regression, looked at the purpose of the next revision, and then checked its changes. In this case they were Rev 431 and Rev 432 respectively. However, Rev 432 had 50 files and a couple hundred lines of code changed. So, over the course of two days, I went through each file one by one, and it turned out to be one line in one file, which was the last one I went through. In Rev 432, I had reverted some changes from some older revs to fix a few regressions causing lock-ups in some titles. Now I need to finish going back through the changes and see which are needed for speed and for preparation for the ME move. After that, I have a few quick ideas I'll look at for further speed. After they are implemented, I'll explain them here. Hopefully, after fixing the speed regressions and adding my new ideas, we should see an overall boost in speed. Then it's on to the Async bug and the rest of the changes leading to Beta 3.
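Finding the first revision that has a regression can be sketched as a bisection. To be clear, the post doesn't say how the last good revision was found, so this is a swapped-in technique rather than the method actually used. `FirstBadRevision` and the `is_bad` test are hypothetical, and the sketch assumes that once the regression appears it stays in every later revision.

```cpp
#include <cassert>
#include <functional>

// Returns the first revision in (good, bad] for which is_bad(rev) is true.
// Assumptions: `good` is known to work, `bad` is known to be broken, and the
// regression never disappears again between the two.
inline int FirstBadRevision(int good, int bad,
                            const std::function<bool(int)>& is_bad) {
    while (bad - good > 1) {
        const int mid = good + (bad - good) / 2;
        if (is_bad(mid)) {
            bad = mid;   // regression already present: look earlier
        } else {
            good = mid;  // still fine: look later
        }
    }
    return bad;
}
```

In practice `is_bad` would mean building that revision and testing the game by hand. This narrows things down to a single revision only; going through the files changed within that revision is still manual work.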
Labels: Beta 3, DaedalusX64, Dynarec, Paper Mario, Regressions, Speed
Friday, October 16, 2009
Rev 435 and upcoming...
Well, lately I've been trying to track down why, when a specific "bug" happens, we get about a 46% increase in speed on Mario 64. In the process of trying to track it down, I noticed a few things that got reverted in 432 and shouldn't have been, plus one more small speed-up where the logic of a function could be improved. The speed-up seems to be fully dynarec related, but it will take more time to track down. Next, I plan to see what part of the 432 changes caused Paper Mario to start freezing so much. Stability (or consistency) of the games that play is definitely needed before we can proceed to move the CPU to the ME. For 435, I added back the reverted code and another speed-up or two. I also added an advanced options menu as a secondary menu to the ROM options. These are all the extra items that can be enabled in the roms.ini file. Most of them cause breaks in many titles, which is why they are in the advanced options, but you can try them out to see if they help the speed or graphics in other titles besides the ones they were specifically written for. Enjoy!
P.S. Sorry this post is late, but the Rev has been posted for a few days.