# Vulnerability History

| Date | High Risk | Low Risk |
|---|---|---|
| 2024-11-11 | 2 | 1 |

# Audit Report Details

- Lines of Code: 18,500
- Findings: 5 open, 0 resolved

## 🚨 High Risk Vulnerabilities

## ⚠️ Low Risk Vulnerabilities
Vulnerable Code:
---
File: /contrib/CODE_REVIEW_DOCS.md
---

# Code Review
### Conceptual Review

A review can be a conceptual review, where the reviewer leaves a comment
 * `Concept (N)ACK`, meaning "I do (not) agree with the general goal of this pull
   request",
 * `Approach (N)ACK`, meaning `Concept ACK`, but "I do (not) agree with the
   approach of this change".

A `NACK` needs to include a rationale why the change is not worthwhile.
NACKs without accompanying reasoning may be disregarded.
After conceptual agreement on the change, code review can be provided. A review
begins with `ACK BRANCH_COMMIT`, where `BRANCH_COMMIT` is the top of the PR
branch, followed by a description of how the reviewer did the review. The
following language is used within pull request comments:

 - "I have tested the code", involving change-specific manual testing in
   addition to running the unit, functional, or fuzz tests, and in case it is
   not obvious how the manual testing was done, it should be described;
 - "I have not tested the code, but I have reviewed it and it looks
   OK, I agree it can be merged";
 - A "nit" refers to a trivial, often non-blocking issue.

### Code Review
Project maintainers reserve the right to weigh the opinions of peer reviewers
using common sense judgement and may also weigh based on merit. Reviewers that
have demonstrated a deeper commitment and understanding of the project over time
or who have clear domain expertise may naturally have more weight, as one would
expect in all walks of life.

Where a patch set affects consensus-critical code, the bar will be much
higher in terms of discussion and peer review requirements, keeping in mind that
mistakes could be very costly to the wider community. This includes refactoring
of consensus-critical code.

Where a patch set proposes to change the Bittensor consensus, it must have been
discussed extensively on the Discord server and other channels, be accompanied by a widely
discussed BIP and have a generally widely perceived technical consensus of being
a worthwhile change based on the judgement of the maintainers.

### Finding Reviewers

As most reviewers are themselves developers with their own projects, the review
process can be quite lengthy, and some amount of patience is required. If you find
that you've been waiting for a pull request to be given attention for several
months, there may be a number of reasons for this, some of which you can do something
about:

 - It may be because of a feature freeze due to an upcoming release. During this time,
   only bug fixes are taken into consideration. If your pull request is a new feature,
   it will not be prioritized until after the release. Wait for the release.
 - It may be because the changes you are suggesting do not appeal to people. Rather than
   nits and critique, which require effort and means they care enough to spend time on your
   contribution, thundering silence is a good sign of widespread (mild) dislike of a given change
   (because people don't assume *others* won't actually like the proposal). Don't take
   that personally, though! Instead, take another critical look at what you are suggesting
   and see if it: changes too much, is too broad, doesn't adhere to the
   [developer notes](DEVELOPMENT_WORKFLOW.md), is dangerous or insecure, is messily written, etc.
   Identify and address any of the issues you find. Then ask e.g. on IRC if someone could give
   their opinion on the concept itself.
 - It may be because your code is too complex for all but a few people, and those people
   may not have realized your pull request even exists. A great way to find people who
   are qualified and care about the code you are touching is the
   [Git Blame feature](https://docs.github.com/en/github/managing-files-in-a-repository/managing-files-on-github/tracking-changes-in-a-file). Simply
   look up who last modified the code you are changing and see if you can find
   them and give them a nudge. Don't be incessant about the nudging, though.
 - Finally, if all else fails, ask on IRC or elsewhere for someone to give your pull request
   a look. If you think you've been waiting for an unreasonably long time (say,
   more than a month) for no particular reason (a few lines changed, etc.),
   this is totally fine. Try to return the favor when someone else is asking
   for feedback on their code, and the universe balances out.
 - Remember that the best thing you can do while waiting is give review to others!


---
File: /contrib/CONTRIBUTING.md
---

# Contributing to Bittensor Subnet Development

The following is a set of guidelines for contributing to the Bittensor ecosystem. These are **HIGHLY RECOMMENDED** guidelines, but not hard-and-fast rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

## Table Of Contents
1. [How Can I Contribute?](#how-can-i-contribute)
   1. [Communication Channels](#communication-channels)
   1. [Code Contribution General Guidelines](#code-contribution-general-guidelines)
   1. [Pull Request Philosophy](#pull-request-philosophy)
   1. [Pull Request Process](#pull-request-process)
   1. [Addressing Feedback](#addressing-feedback)
   1. [Squashing Commits](#squashing-commits)
   1. [Refactoring](#refactoring)
   1. [Peer Review](#peer-review)
   1. [Suggesting Features](#suggesting-enhancements-and-features)


## How Can I Contribute?
TODO(developer): Define your desired contribution procedure.

## Communication Channels
TODO(developer): Place your communication channels here

> Please follow the Bittensor Subnet [style guide](./STYLE.md) regardless of your contribution type.

Here is a high-level summary:
- Code consistency is crucial; adhere to established programming language conventions.
- Use `black` to format your Python code; it ensures readability and consistency.
- Write concise Git commit messages; summarize changes in ~50 characters.
- Follow these six commit rules:
  - Atomic Commits: Focus on one task or fix per commit.
  - Subject and Body Separation: Use a blank line to separate the subject from the body.
  - Subject Line Length: Keep it under 50 characters for readability.
  - Imperative Mood: Write the subject line as if giving a command or instruction.
  - Body Text Width: Wrap text manually at 72 characters.
  - Body Content: Explain what changed and why, not how.
- Make use of your commit messages to simplify project understanding and maintenance.

> For clear examples of each of the commit rules, see the style guide's [rules](./STYLE.md#the-six-rules-of-a-great-commit) section.

### Code Contribution General Guidelines

> Review the Bittensor Subnet [style guide](./STYLE.md) and [development workflow](./DEVELOPMENT_WORKFLOW.md) before contributing.


#### Pull Request Philosophy

Patchsets and enhancements should always be focused. A pull request could add a feature, fix a bug, or refactor code, but it should not contain a mixture of these. Please also avoid 'super' pull requests which attempt to do too much, are overly large, or overly complex, as this makes review difficult.

Specifically, pull requests must adhere to the following criteria:
- Contain fewer than 50 files. PRs with more than 50 files will be closed.
- If a PR introduces a new feature, it *must* include corresponding tests.
- Other PRs (bug fixes, refactoring, etc.) should ideally also have tests, as they provide proof of concept and prevent regression.
- Categorize your PR properly by using GitHub labels. This aids in the review process by informing reviewers about the type of change at a glance.
- Make sure your code includes adequate comments. These should explain why certain decisions were made and how your changes work.
- If your changes are extensive, consider breaking your PR into smaller, related PRs. This makes your contributions easier to understand and review.
- Be active in the discussion about your PR. Respond promptly to comments and questions to help reviewers understand your changes and speed up the acceptance process.

Generally, all pull requests must:

 - Have a clear use case, fix a demonstrable bug or serve the greater good of the project (e.g. refactoring for modularisation).
 - Be well peer-reviewed.
 - Follow code style guidelines.
 - Not break the existing test suite.
 - Where bugs are fixed, where possible, there should be unit tests demonstrating the bug and also proving the fix.
 - Change relevant comments and documentation when behaviour of code changes.

#### Pull Request Process

Please follow these steps to have your contribution considered by the maintainers:

*Before* creating the PR:
1. Read the [development workflow](./DEVELOPMENT_WORKFLOW.md) defined for this repository to understand our workflow.
2. Ensure your PR meets the criteria stated in the 'Pull Request Philosophy' section.
3. Include relevant tests for any fixed bugs or new features as stated in the [testing guide](./TESTING.md).
4. Ensure your commit messages are clear and concise. Include the issue number if applicable.
5. If you have multiple commits, rebase them into a single commit using `git rebase -i`.
6. Explain what your changes do and why you think they should be merged in the PR description, consistent with the [style guide](./STYLE.md).

*After* creating the PR:
1. Verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing after you submit your pull request.
2. Label your PR using GitHub's labeling feature. The labels help categorize the PR and streamline the review process.
3. Document your code with comments that provide a clear understanding of your changes. Explain any non-obvious parts of your code or design decisions you've made.
4. If your PR has extensive changes, consider splitting it into smaller, related PRs. This reduces the cognitive load on the reviewers and speeds up the review process.

Please be responsive and participate in the discussion on your PR! This aids in clarifying any confusion or concerns and leads to quicker resolution and merging of your PR.

> Note: If your changes are not ready for merge but you want feedback, create a draft pull request.

Following these criteria will aid in quicker review and potential merging of your PR.
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.

When you are ready to submit your changes, create a pull request:

> **Always** follow the [style guide](./STYLE.md) and [development workflow](./DEVELOPMENT_WORKFLOW.md) before submitting pull requests.

After you submit a pull request, it will be reviewed by the maintainers. They may ask you to make changes. Please respond to any comments and push your changes as a new commit.

> Note: Be sure to merge the latest from "upstream" before making a pull request:

```bash
git remote add upstream https://github.com/opentensor/bittensor.git # TODO(developer): replace with your repo URL
git fetch upstream
git merge upstream/<your-branch-name>
git push origin <your-branch-name>
```

#### Addressing Feedback

After submitting your pull request, expect comments and reviews from other contributors. You can add more commits to your pull request by committing them locally and pushing to your fork.

You are expected to reply to any review comments before your pull request is merged. You may update the code or reject the feedback if you do not agree with it, but you should express so in a reply. If there is outstanding feedback and you are not actively working on it, your pull request may be closed.

#### Squashing Commits

If your pull request contains fixup commits (commits that change the same line of code repeatedly) or too fine-grained commits, you may be asked to [squash](https://git-scm.com/docs/git-rebase#_interactive_mode) your commits before it will be reviewed. The basic squashing workflow is shown below.

    git checkout your_branch_name
    git rebase -i HEAD~n
    # n is normally the number of commits in the pull request.
    # Set commits (except the one in the first line) from 'pick' to 'squash', save and quit.
    # On the next screen, edit/refine commit messages.
    # Save and quit.
    git push -f # (force push to GitHub)

Please update the resulting commit message, if needed. It should read as a coherent message. In most cases, this means not just listing the interim commits.

If your change contains a merge commit, the above workflow may not work and you will need to remove the merge commit first. See the next section for details on how to rebase.

Please refrain from creating several pull requests for the same change. Use the pull request that is already open (or was created earlier) to amend changes. This preserves the discussion and review that happened earlier for the respective change set.

The length of time required for peer review is unpredictable and will vary from pull request to pull request.

#### Refactoring

Refactoring is a necessary part of any software project's evolution. The following guidelines cover refactoring pull requests for the project.

There are three categories of refactoring: code-only moves, code style fixes, and code refactoring. In general, refactoring pull requests should not mix these three kinds of activities in order to make refactoring pull requests easy to review and uncontroversial. In all cases, refactoring PRs must not change the behaviour of code within the pull request (bugs must be preserved as is).

Project maintainers aim for a quick turnaround on refactoring pull requests, so where possible keep them short, uncomplex and easy to verify.

Pull requests that refactor the code should not be made by new contributors. It requires a certain level of experience to know where the code belongs and to understand the full ramifications (including the rebase effort for open pull requests). Trivial pull requests, or pull requests that refactor the code with no clear benefits, may be immediately closed by the maintainers to reduce unnecessary workload on reviewing.

#### Peer Review

Anyone may participate in peer review, which is expressed by comments in the pull request. Typically reviewers will review the code for obvious errors, as well as test out the patch set and opine on the technical merits of the patch. Project maintainers take into account the peer review when determining if there is consensus to merge a pull request (remember that discussions may have taken place elsewhere, not just on GitHub). The following language is used within pull-request comments:

- ACK means "I have tested the code and I agree it should be merged";
- NACK means "I disagree this should be merged", and must be accompanied by sound technical justification. NACKs without accompanying reasoning may be disregarded;
- utACK means "I have not tested the code, but I have reviewed it and it looks OK, I agree it can be merged";
- Concept ACK means "I agree in the general principle of this pull request";
- Nit refers to trivial, often non-blocking issues.

Reviewers should include the commit(s) they have reviewed in their comments. This can be done by copying the commit SHA1 hash.

A pull request that changes consensus-critical code is considerably more involved than a pull request that adds a feature to the wallet, for example. Such patches must be reviewed and thoroughly tested by several reviewers who are knowledgeable about the changed subsystems. Where new features are proposed, it is helpful for reviewers to try out the patch set on a test network and indicate that they have done so in their review. Project maintainers will take this into consideration when merging changes.

For a more detailed description of the review process, see the [Code Review Guidelines](CODE_REVIEW_DOCS.md).

> **Note:** If you find a **Closed** issue that seems like it is the same thing that you're experiencing, open a new issue and include a link to the original issue in the body of your new one.

#### How Do I Submit A (Good) Bug Report?

Please track bugs as GitHub issues.

Explain the problem and include additional details to help maintainers reproduce the problem:

* **Use a clear and descriptive title** for the issue to identify the problem.
* **Describe the exact steps which reproduce the problem** in as many details as possible. For example, start by explaining how you started the application, e.g. which command exactly you used in the terminal, or how you started Bittensor otherwise. When listing steps, **don't just say what you did, but explain how you did it**. For example, if you ran with a set of custom configs, explain if you used a config file or command line arguments.
* **Provide specific examples to demonstrate the steps**. Include links to files or GitHub projects, or copy/pasteable snippets, which you use in those examples. If you're providing snippets in the issue, use [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the behavior you observed after following the steps** and point out what exactly is the problem with that behavior.
* **Explain which behavior you expected to see instead and why.**
* **Include screenshots and animated GIFs** which show you following the described steps and clearly demonstrate the problem. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **If you're reporting that Bittensor crashed**, include a crash report with a stack trace from the operating system. On macOS, the crash report will be available in `Console.app` under "Diagnostic and usage information" > "User diagnostic reports". Include the crash report in the issue in a [code block](https://help.github.com/articles/markdown-basics/#multiple-lines), a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or put it in a [gist](https://gist.github.com/) and provide a link to that gist.
* **If the problem is related to performance or memory**, include a CPU profile capture with your report; if you're using a GPU, include a GPU profile capture as well. Look into the [PyTorch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) to inspect the memory usage of your model; a minimal sketch follows this list.
* **If the problem wasn't triggered by a specific action**, describe what you were doing before the problem happened and share more information using the guidelines below.
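
As a hedged illustration of capturing such a profile with the PyTorch Profiler (a minimal sketch, assuming `torch` is installed; the toy model and tensor sizes are invented for the example):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(512, 512)  # toy model, stand-in for your miner's model
inputs = torch.randn(8, 512)

# Profile CPU (and CUDA, when available) time and memory for one forward pass.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)
    model, inputs = model.cuda(), inputs.cuda()

with profile(activities=activities, profile_memory=True, record_shapes=True) as prof:
    model(inputs)

# Paste this table into the issue alongside your report.
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))
```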

Provide more context by answering these questions:

* **Did the problem start happening recently** (e.g. after updating to a new version) or was this always a problem?
* If the problem started happening recently, **can you reproduce the problem in an older version of Bittensor?**
* **Can you reliably reproduce the issue?** If not, provide details about how often the problem happens and under which conditions it normally happens.

Include details about your configuration and environment:

* **Which version of Bittensor Subnet are you using?**
* **What commit hash are you on?** You can get the exact commit hash by checking `git log` and pasting the full commit hash.
* **What's the name and version of the OS you're using?**
* **Are you running Bittensor Subnet in a virtual machine?** If so, which VM software are you using and which operating systems and versions are used for the host and the guest?
* **Are you running Bittensor Subnet in a dockerized container?** If so, have you made sure that your docker container contains your latest changes and is up to date with the master branch?

### Suggesting Enhancements and Features

This section guides you through submitting an enhancement suggestion, including completely new features and minor improvements to existing functionality. Following these guidelines helps maintainers and the community understand your suggestion :pencil: and find related suggestions :mag_right:.

When you are creating an enhancement suggestion, please [include as many details as possible](#how-do-i-submit-a-good-feature-suggestion). Fill in [the template](https://bit.ly/atom-behavior-pr), including the steps that you imagine you would take if the feature you're requesting existed.

#### Before Submitting An Enhancement Suggestion

* **Check the [debugging guide](./DEBUGGING.md)** for tips — you might discover that the enhancement is already available. Most importantly, check if you're using the latest version of the project first.

#### How Do I Submit A (Good) Feature Suggestion?

* **Use a clear and descriptive title** for the issue to identify the problem.
* **Provide a step-by-step description of the suggested enhancement** in as many details as possible.
* **Provide specific examples to demonstrate the steps**. Include copy/pasteable snippets which you use in those examples, as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the current behavior** and **explain which behavior you expected to see instead** and why.
* **Include screenshots and animated GIFs** which help you demonstrate the steps or point out the part of the project which the suggestion is related to. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **Explain why this enhancement would be useful** to most users.
* **List some other text editors or applications where this enhancement exists.**
* **Specify the name and version of the OS you're using.**

Thank you for considering contributing to Bittensor! Any help is greatly appreciated along this journey to incentivize open and permissionless intelligence.


---
File: /contrib/DEVELOPMENT_WORKFLOW.md
---

# Bittensor Subnet Development Workflow

This is a highly advisable workflow to follow to keep your subtensor project organized and foster ease of contribution.

## Table of contents

- [Bittensor Subnet Development Workflow](#bittensor-subnet-development-workflow)
  - [Main Branches](#main-branches)
  - [Development Model](#development-model)
    - [Feature Branches](#feature-branches)
    - [Release Branches](#release-branches)
    - [Hotfix Branches](#hotfix-branches)
  - [Git Operations](#git-operations)
    - [Creating a Feature Branch](#creating-a-feature-branch)
    - [Merging Feature Branch into Staging](#merging-feature-branch-into-staging)
    - [Creating a Release Branch](#creating-a-release-branch)
    - [Finishing a Release Branch](#finishing-a-release-branch)
    - [Creating a Hotfix Branch](#creating-a-hotfix-branch)
    - [Finishing a Hotfix Branch](#finishing-a-hotfix-branch)
  - [Continuous Integration (CI) and Continuous Deployment (CD)](#continuous-integration-ci-and-continuous-deployment-cd)
  - [Versioning and Release Notes](#versioning-and-release-notes)
  - [Pending Tasks](#pending-tasks)

## Main Branches

Bittensor's codebase consists of two main branches: **main** and **staging**.

**main**
- This is Bittensor's live production branch, which should only be updated by the core development team. This branch is protected, so refrain from pushing or merging into it unless authorized.

**staging**
- This branch is continuously updated and is where you propose and merge changes. It's essentially Bittensor's active development branch.

## Development Model

### Feature Branches

- Branch off from: `staging`
- Merge back into: `staging`
- Naming convention: `feature/<ticket>/<descriptive-sentence>`

Feature branches are used to develop new features for upcoming or future releases. They exist as long as the feature is in development, but will eventually be merged into `staging` or discarded. Always delete your feature branch after merging to avoid unnecessary clutter.

### Release Branches

- Branch off from: `staging`
- Merge back into: `staging` and then `main`
- Naming convention: `release/<version>/<descriptive-message>/<creator's-name>`

Release branches support the preparation of a new production release, allowing for minor bug fixes and preparation of metadata (version number, configuration, etc). All new features should be merged into `staging` and wait for the next big release.

### Hotfix Branches

General workflow:

- Branch off from: `main` or `staging`
- Merge back into: `staging` then `main`
- Naming convention: `hotfix/<version>/<descriptive-message>/<creator's-name>`

Hotfix branches are meant for quick fixes in the production environment. When a critical bug in a production version must be resolved immediately, a hotfix branch is created.

## Git Operations

#### Creating a Feature Branch

1. Branch from the **staging** branch.
   1. Command: `git checkout -b feature/my-feature staging`

> Rebase frequently with the updated staging branch so you do not face big conflicts before submitting your pull request. Remember, syncing your changes with other developers could also help you avoid big conflicts.

#### Merging Feature Branch into Staging

In other words, integrate your changes into a branch that will be tested and prepared for release.

1. Switch branch to staging: `git checkout staging`
2. Merge the feature branch into staging: `git merge --no-ff feature/my-feature`
3. Push changes to staging: `git push origin staging`
4. Delete the feature branch: `git branch -d feature/my-feature` (alternatively, this can be done in the GitHub web UI)

This operation is done by GitHub when merging a PR.

So, what you have to keep in mind is:
- Open the PR against the `staging` branch.
- After merging a PR you should delete your feature branch. This will be strictly enforced.

#### Creating a Release Branch

1. Create a branch from staging: `git checkout -b release/3.4.0/descriptive-message/creator's-name staging`
2. Update the version with major or minor: `./scripts/update_version.sh major|minor`
3. Commit file changes with the new version: `git commit -a -m "Updated version to 3.4.0"`


#### Finishing a Release Branch

This involves releasing stable code and generating a new version for bittensor.

1. Switch branch to main: `git checkout main`
2. Merge the release branch into main: `git merge --no-ff release/3.4.0/optional-descriptive-message`
3. Tag the changeset: `git tag -a v3.4.0 -m "Releasing v3.4.0: some comment about it"`
4. Push changes to main: `git push origin main`
5. Push tags to origin: `git push origin --tags`

To keep the changes made in the __release__ branch, we need to merge those back into `staging`:

- Switch branch to staging: `git checkout staging`.
- Merge the release branch into staging: `git merge --no-ff release/3.4.0/optional-descriptive-message`

This step may well lead to a merge conflict (probably even, since we have changed the version number). If so, fix it and commit.


#### Creating a Hotfix Branch

1. Create a branch from main: `git checkout -b hotfix/3.3.4/descriptive-message/creator's-name main`
2. Update the patch version: `./scripts/update_version.sh patch`
3. Commit file changes with the new version: `git commit -a -m "Updated version to 3.3.4"`
4. Fix the bug and commit the fix: `git commit -m "Fixed critical production issue X"`

#### Finishing a Hotfix Branch

Finishing a hotfix branch involves merging the bugfix into both `main` and `staging`.

1. Switch branch to main: `git checkout main`
2. Merge the hotfix into main: `git merge --no-ff hotfix/3.3.4/optional-descriptive-message`
3. Tag the new version: `git tag -a v3.3.4 -m "Releasing v3.3.4: descriptive comment about the hotfix"`
4. Push changes to main: `git push origin main`
5. Push tags to origin: `git push origin --tags`
6. Switch branch to staging: `git checkout staging`
7. Merge the hotfix into staging: `git merge --no-ff hotfix/3.3.4/descriptive-message/creator's-name`
8. Push changes to origin/staging: `git push origin staging`
9. Delete the hotfix branch: `git branch -d hotfix/3.3.4/optional-descriptive-message`

The one exception to the rule here is that, **when a release branch currently exists, the hotfix changes need to be merged into that release branch, instead of** `staging`. Back-merging the bugfix into the __release__ branch will eventually result in the bugfix being merged into `staging` too, when the release branch is finished. (If work in `staging` immediately requires this bugfix and cannot wait for the release branch to be finished, you may safely merge the bugfix into `staging` now already as well.)

Finally, we remove the temporary branch:

- `git branch -d hotfix/3.3.4/optional-descriptive-message`

## Continuous Integration (CI) and Continuous Deployment (CD)

Continuous Integration (CI) is a software development practice where members of a team integrate their work frequently. Each integration is verified by an automated build and test process to detect integration errors as quickly as possible.

Continuous Deployment (CD) is a software engineering approach in which software functionalities are delivered frequently through automated deployments.

- **CircleCI jobs**: Create jobs in CircleCI to automate merging staging into main and releasing a version (needed to release code), as well as building and testing Bittensor (needed to merge PRs).

> It is highly recommended to set up your own CircleCI pipeline for your subnet.

## Versioning and Release Notes

Semantic versioning helps keep track of the different versions of the software. When code is merged into main, generate a new version.

Release notes provide documentation for each version released to the users, highlighting the new features, improvements, and bug fixes. When merged into main, generate a GitHub release and release notes.

## Pending Tasks

Follow these steps when you are contributing to the Bittensor subnet:

- Determine if main and staging are different
- Determine what is in staging that is not merged yet
  - Document unreleased developments
  - When merged into staging, generate information about what's merged into staging but not released.
  - When merged into main, generate GitHub release and release notes.
- CircleCI jobs
  - Merge staging into main and release version (needed to release code)
  - Build and Test Bittensor (needed to merge PRs)

This document can be improved as the Bittensor project continues to develop and change.


---
File: /contrib/STYLE.md
---

# Style Guide

A project’s long-term success rests (among other things) on its maintainability, and a maintainer has few tools more powerful than his or her project’s log. It’s worth taking the time to learn how to care for one properly. What may be a hassle at first soon becomes habit, and eventually a source of pride and productivity for all involved.

Most programming languages have well-established conventions as to what constitutes idiomatic style, i.e. naming, formatting and so on. There are variations on these conventions, of course, but most developers agree that picking one and sticking to it is far better than the chaos that ensues when everybody does their own thing.

# Table of Contents
1. [Code Style](#code-style)
2. [Naming Conventions](#naming-conventions)
3. [Git Commit Style](#git-commit-style)
4. [The Six Rules of a Great Commit](#the-six-rules-of-a-great-commit)
   - [1. Atomic Commits](#1-atomic-commits)
   - [2. Separate Subject from Body with a Blank Line](#2-separate-subject-from-body-with-a-blank-line)
   - [3. Limit the Subject Line to 50 Characters](#3-limit-the-subject-line-to-50-characters)
   - [4. Use the Imperative Mood in the Subject Line](#4-use-the-imperative-mood-in-the-subject-line)
   - [5. Wrap the Body at 72 Characters](#5-wrap-the-body-at-72-characters)
   - [6. Use the Body to Explain What and Why vs. How](#6-use-the-body-to-explain-what-and-why-vs-how)
5. [Tools Worth Mentioning](#tools-worth-mentioning)
   - [Using `--fixup`](#using---fixup)
   - [Interactive Rebase](#interactive-rebase)
6. [Pull Request and Squashing Commits Caveats](#pull-request-and-squashing-commits-caveats)


### Code style

#### General Style
Python's official style guide is PEP 8, which provides conventions for writing code for the main Python distribution. Here are some key points, illustrated by the sketch below the list:

- `Indentation:` Use 4 spaces per indentation level.

- `Line Length:` Limit all lines to a maximum of 79 characters.

- `Blank Lines:` Surround top-level function and class definitions with two blank lines. Method definitions inside a class are surrounded by a single blank line.

- `Imports:` Imports should usually be on separate lines and should be grouped in the following order:
  - Standard library imports.
  - Related third party imports.
  - Local application/library specific imports.

- `Whitespace:` Avoid extraneous whitespace in the following situations:
  - Immediately inside parentheses, brackets or braces.
  - Immediately before a comma, semicolon, or colon.
  - Immediately before the open parenthesis that starts the argument list of a function call.

- `Comments:` Comments should be complete sentences, used to clarify code; they are not a substitute for poorly written code.
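
As a quick, hedged illustration of those points (a minimal sketch; the module and function names are invented for the example):

```python
# Standard library imports come first; third-party and local imports
# would follow in their own groups, separated by blank lines.
import os
import sys


MAX_DEPTH = 3  # module-level names sit at the left margin


def split_extension(path):
    """No space before the call's open parenthesis, none inside the parens."""
    return os.path.splitext(path)


class PathReport:
    """Top-level definitions are surrounded by two blank lines."""

    def __init__(self, paths):
        self.paths = paths

    def extensions(self):
        """Methods inside a class are separated by a single blank line."""
        return [split_extension(p)[1] for p in self.paths]


if __name__ == "__main__":
    print(PathReport(sys.argv[1:]).extensions())
```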

#### For Python

- `List Comprehensions:` Use list comprehensions for concise and readable creation of lists.

- `Generators:` Use generators when dealing with large amounts of data to save memory.

- `Context Managers:` Use context managers (`with` statement) for resource management.

- `String Formatting:` Use f-strings for formatting strings in Python 3.6 and above.

- `Error Handling:` Use exceptions for error handling whenever possible.
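
Taken together, these idioms look roughly like this in practice (a minimal, self-contained sketch; the values are invented):

```python
import tempfile

# List comprehension: concise, readable list construction.
squares = [n * n for n in range(10)]

# Generator expression: computes values lazily, saving memory on large inputs.
total = sum(n * n for n in range(1_000_000))

# Context manager: the file is closed even if an exception is raised.
with tempfile.TemporaryFile(mode="w+") as handle:
    handle.write("hello")

# f-string (Python 3.6+): readable string interpolation.
name, version = "bittensor", 3
print(f"{name} release {version}, sum of squares {total}")

# Exceptions for error handling rather than sentinel return values.
try:
    int("not a number")
except ValueError as err:
    print(f"could not parse input: {err}")
```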

#### More details

Use `black` to format your Python code before committing, for consistency across such a large pool of contributors. Black's code [style](https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#code-style) ensures consistent and opinionated code formatting. It automatically formats your Python code according to the Black style guide, enhancing code readability and maintainability.

Key features of Black:

- **Consistency:** Black enforces a single, consistent coding style across your project, eliminating style debates and allowing developers to focus on code logic.

- **Readability:** By applying a standard formatting style, Black improves code readability, making it easier to understand and collaborate on projects.

- **Automation:** Black automates the code formatting process, saving time and effort. It eliminates the need for manual formatting and reduces the likelihood of inconsistencies.
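
As a rough before/after sketch of what Black does (the snippet is invented; Black is typically run as `black .` from the repository root, after `pip install black`):

```python
# Before: legal Python, but inconsistently spaced.
def scale(values,factor = 2):
    return [  v*factor for v in values ]


# After running Black, the same function comes out normalized:
def scale(values, factor=2):
    return [v * factor for v in values]
```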

### Naming Conventions

- `Classes:` Class names should normally use the CapWords convention.
- `Functions and Variables:` Function names should be lowercase, with words separated by underscores as necessary to improve readability. Variable names follow the same convention as function names.

- `Constants:` Constants are usually defined on a module level and written in all capital letters with underscores separating words.

- `Non-public Methods and Instance Variables:` Use a single leading underscore (_). This is a weak "internal use" indicator.

- `Strongly "private" methods and variables:` Use a double leading underscore (__). This triggers name mangling in Python.
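
All of these conventions in one place (a minimal sketch; the class and attribute names are invented):

```python
TIMEOUT_SECONDS = 30  # constant: all caps with underscores


class MetagraphCache:  # class: CapWords
    def __init__(self):
        self._entries = {}      # single leading underscore: internal use
        self.__secret_seed = 7  # double leading underscore: name-mangled

    def lookup_hotkey(self, hotkey_name):  # functions/variables: snake_case
        return self._entries.get(hotkey_name)


cache = MetagraphCache()
print(cache.lookup_hotkey("default"))  # prints None; the cache starts empty
```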


### Git commit style

Here’s a model Git commit message when contributing:
```
Summarize changes in around 50 characters or less

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of the commit and the rest of the text as the body. The
blank line separating the summary from the body is critical (unless
you omit the body entirely); various tools like `log`, `shortlog`
and `rebase` can get confused if you run the two together.

Explain the problem that this commit is solving. Focus on why you
are making this change as opposed to how (the code explains that).
Are there side effects or other unintuitive consequences of this
change? Here's the place to explain them.

Further paragraphs come after blank lines.

 - Bullet points are okay, too

 - Typically a hyphen or asterisk is used for the bullet, preceded
   by a single space, with blank lines in between, but conventions
   vary here

If you use an issue tracker, put references to them at the bottom,
like this:

Resolves: #123
See also: #456, #789
```


## The Six Rules of a Great Commit

#### 1. Atomic Commits
An “atomic” change revolves around one task or one fix.

Atomic approach:
 - Commit each fix or task as a separate change
 - Only commit when a block of work is complete
 - Commit each layout change separately
 - Joint commit for layout file, code-behind file, and additional resources

Benefits:

- Easy to roll back without affecting other changes
- Easy to make other changes on the fly
- Easy to merge features to other branches

#### Avoid trivial commit messages

Commit messages like "fix", "fix2", or "fix3" don't provide any context or clear understanding of what changes the commit introduces. Here are some examples of good vs. bad commit messages:

**Bad Commit Message:**

    $ git commit -m "fix"

**Good Commit Message:**

    $ git commit -m "Fix typo in README file"

> **Caveat**: When working with new features, an atomic commit will often consist of multiple files, since a layout file, code-behind file, and additional resources may have been added/modified. You don’t want to commit all of these separately, because if you had to roll back the application to a state before the feature was added, it would involve multiple commit entries, and that can get confusing.

#### 2. Separate subject from body with a blank line

Not every commit requires both a subject and a body. Sometimes a single line is fine, especially when the change is so simple that no further context is necessary.

For example:

    Fix typo in introduction to user guide

Nothing more need be said; if the reader wonders what the typo was, she can simply take a look at the change itself, i.e. use `git show`, `git diff` or `git log -p`.

If you’re committing something like this at the command line, it’s easy to use the `-m` option to `git commit`:

    $ git commit -m"Fix typo in introduction to user guide"

However, when a commit merits a bit of explanation and context, you need to write a body. For example:

    Derezz the master control program

    MCP turned out to be evil and had become intent on world domination.
    This commit throws Tron's disc into MCP (causing its deresolution)
    and turns it back into a chess game.

Commit messages with bodies are not so easy to write with the `-m` option. You’re better off writing the message in a proper text editor. [See Pro Git](https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration).

In any case, the separation of subject from body pays off when browsing the log. Here’s the full log entry:

    $ git log
    commit 42e769bdf4894310333942ffc5a15151222a87be
    Author: Kevin Flynn <[email protected]>
    Date:   Fri Jan 01 00:00:00 1982 -0200

        Derezz the master control program

        MCP turned out to be evil and had become intent on world domination.
        This commit throws Tron's disc into MCP (causing its deresolution)
        and turns it back into a chess game.


#### 3. Limit the subject line to 50 characters
50 characters is not a hard limit, just a rule of thumb. Keeping subject lines at this length ensures that they are readable, and forces the author to think for a moment about the most concise way to explain what’s going on.

GitHub’s UI is fully aware of these conventions. It will warn you if you go past the 50 character limit, and will truncate any subject line longer than 72 characters with an ellipsis, so keeping it to 50 is best practice.

#### 4. Use the imperative mood in the subject line
Imperative mood just means “spoken or written as if giving a command or instruction”. A few examples:

    Clean your room
    Close the door
    Take out the trash

Each of the six rules you’re reading about right now is written in the imperative (“Wrap the body at 72 characters”, etc.).

The imperative can sound a little rude; that’s why we don’t often use it. But it’s perfect for Git commit subject lines. One reason for this is that Git itself uses the imperative whenever it creates a commit on your behalf.

For example, the default message created when using git merge reads:

    Merge branch 'myfeature'

And when using git revert:

    Revert "Add the thing with the stuff"

    This reverts commit cc87791524aedd593cff5a74532befe7ab69ce9d.

Or when clicking the “Merge” button on a GitHub pull request:

    Merge pull request #123 from someuser/somebranch

So when you write your commit messages in the imperative, you’re following Git’s own built-in conventions. For example:

    Refactor subsystem X for readability
    Update getting started documentation
    Remove deprecated methods
    Release version 1.0.0

Writing this way can be a little awkward at first. We’re more used to speaking in the indicative mood, which is all about reporting facts. That’s why commit messages often end up reading like this:

    Fixed bug with Y
    Changing behavior of X

And sometimes commit messages get written as a description of their contents:

    More fixes for broken stuff
    Sweet new API methods

To remove any confusion, here’s a simple rule to get it right every time.

**A properly formed Git commit subject line should always be able to complete the following sentence:**

    If applied, this commit will <your subject line here>

For example:

    If applied, this commit will refactor subsystem X for readability
    If applied, this commit will update getting started documentation
    If applied, this commit will remove deprecated methods
    If applied, this commit will release version 1.0.0
    If applied, this commit will merge pull request #123 from user/branch

#### 5. Wrap the body at 72 characters
Git never wraps text automatically. When you write the body of a commit message, you must mind its right margin, and wrap text manually.

The recommendation is to do this at 72 characters, so that Git has plenty of room to indent text while still keeping everything under 80 characters overall.

A good text editor can help here. It’s easy to configure Vim, for example, to wrap text at 72 characters when you’re writing a Git commit.

#### 6. Use the body to explain what and why vs. how
This [commit](https://github.com/bitcoin/bitcoin/commit/eb0b56b19017ab5c16c745e6da39c53126924ed6) from Bitcoin Core is a great example of explaining what changed and why:

```
commit eb0b56b19017ab5c16c745e6da39c53126924ed6
Author: Pieter Wuille <[email protected]>
Date:   Fri Aug 1 22:57:55 2014 +0200

   Simplify serialize.h's exception handling

   Remove the 'state' and 'exceptmask' from serialize.h's stream
   implementations, as well as related methods.

   As exceptmask always included 'failbit', and setstate was always
   called with bits = failbit, all it did was immediately raise an
   exception. Get rid of those variables, and replace the setstate
   with direct exception throwing (which also removes some dead
   code).

   As a result, good() is never reached after a failure (there are
   only 2 calls, one of which is in tests), and can just be replaced
   by !eof().

   fail(), clear(n) and exceptions() are just never called. Delete
   them.
```

Take a look at the [full diff](https://github.com/bitcoin/bitcoin/commit/eb0b56b19017ab5c16c745e6da39c53126924ed6) and just think how much time the author is saving fellow and future committers by taking the time to provide this context here and now. If he didn’t, it would probably be lost forever.

In most cases, you can leave out details about how a change has been made. Code is generally self-explanatory in this regard (and if the code is so complex that it needs to be explained in prose, that’s what source comments are for). Just focus on making clear the reasons why you made the change in the first place—the way things worked before the change (and what was wrong with that), the way they work now, and why you decided to solve it the way you did.

The future maintainer that thanks you may be yourself!



#### Tools worth mentioning

##### Using `--fixup`

If you've made a commit and then realize you've missed something or made a minor mistake, you can use the `--fixup` option.

For example, suppose you've made a commit with a hash `9fceb02`. Later, you realize you've left a debug statement in your code. Instead of making a new commit titled "remove debug statement" or "fix", you can do the following:

    $ git commit --fixup 9fceb02

This will create a new commit to fix the issue, with a message like "fixup! The original commit message".

##### Interactive Rebase

Interactive rebase, or `rebase -i`, can be used to squash these fixup commits into the original commits they're fixing, which cleans up your commit history. You can use the `--autosquash` option to automatically squash any commits marked as "fixup" into their target commits.

For example:

    $ git rebase -i --autosquash HEAD~5

This command starts an interactive rebase for the last 5 commits (`HEAD~5`). Any commits marked as "fixup" will be automatically moved to squash with their target commits.

The benefit of using `--fixup` and interactive rebase is that it keeps your commit history clean and readable. It groups fixes with the commits they are related to, rather than having a separate "fix" commit that might not make sense to other developers (or even to you) in the future.


---

#### Pull Request and Squashing Commits Caveats

While atomic commits are great for development and for understanding the changes within the branch, the commit history can get messy when merging to the main branch. To keep a cleaner and more understandable commit history in our main branch, we encourage squashing all the commits of a PR into one when merging.

This single commit should provide an overview of the changes that the PR introduced. It should follow the guidelines for atomic commits (an atomic commit is complete, self-contained, and understandable) but on the scale of the entire feature, task, or fix that the PR addresses. This approach combines the benefits of atomic commits during development with a clean commit history in our main branch.

Here is how you can squash commits:

```bash
git rebase -i HEAD~n
```

where `n` is the number of commits to squash. After running the command, replace `pick` with `squash` for the commits you want to squash into the previous commit. This will combine the commits and allow you to write a new commit message.

In this context, an atomic commit message could look like:

```
Add feature X

This commit introduces feature X which does A, B, and C. It adds
new files for layout, updates the code behind the file, and introduces
new resources. This change is important because it allows users to
perform task Y more efficiently.

It includes:
- Creation of new layout file
- Updates in the code-behind file
- Addition of new resources

Resolves: #123
```

In your PRs, remember to detail what the PR is introducing or fixing. This will be helpful for reviewers to understand the context and the reason behind the changes.


---
File: /docs/stream_tutorial/client.py
---

import argparse
import asyncio
import bittensor as bt

from protocol import StreamPrompting

"""
This has assumed you have:
1. Registered your miner on the chain (finney/test)
2. Are serving your miner on an open port (e.g. 12345)

Steps:
- Instantiate your synapse subclass with the relevant information. E.g. messages, roles, etc.
- Instantiate your wallet and a dendrite client
- Query the dendrite client with your synapse object
- Iterate over the async generator to extract the yielded tokens on the server side
"""


async def query_synapse(my_uid, wallet_name, hotkey, network, netuid):
    syn = StreamPrompting(
        roles=["user"],
        messages=[
            "hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
        ],
    )

    # Create a wallet instance with the provided wallet name and hotkey.
    wallet = bt.wallet(name=wallet_name, hotkey=hotkey)

    # Instantiate the metagraph with the provided network and netuid.
    metagraph = bt.metagraph(
        netuid=netuid, network=network, sync=True, lite=False
    )

    # Grab the axon you're serving.
    axon = metagraph.axons[my_uid]

    # Create a Dendrite instance to handle client-side communication.
    dendrite = bt.dendrite(wallet=wallet)

    async def main():
        responses = await dendrite(
            [axon], syn, deserialize=False, streaming=True
        )

        for resp in responses:
            i = 0
            async for chunk in resp:
                i += 1
                if i % 5 == 0:
                    print()
                if isinstance(chunk, list):
                    print(chunk[0], end="", flush=True)
                else:
                    # The last object yielded is the synapse itself with the completion filled.
                    synapse = chunk
                    break

    # Run the main function with asyncio.
    await main()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Query a Bittensor synapse with given parameters."
    )

    # Adding arguments
    parser.add_argument(
        "--my_uid",
        type=int,
        required=True,
        help="Your unique miner ID on the chain",
    )
    parser.add_argument(
        "--netuid", type=int, required=True, help="Network Unique ID"
    )
    parser.add_argument(
        "--wallet_name", type=str, default="default", help="Name of the wallet"
    )
    parser.add_argument(
        "--hotkey", type=str, default="default", help="Hotkey for the wallet"
    )
    parser.add_argument(
        "--network",
        type=str,
        default="test",
        help='Network type, e.g., "test" or "mainnet"',
    )

    # Parse arguments
    args = parser.parse_args()

    # Running the async function with provided arguments
    asyncio.run(
        query_synapse(
            args.my_uid,
            args.wallet_name,
            args.hotkey,
            args.network,
            args.netuid,
        )
    )


---
File: /docs/stream_tutorial/config.py
---

import bittensor as bt
import argparse
import os


def check_config(cls, config: "bt.Config"):
    bt.axon.check_config(config)
    bt.logging.check_config(config)
    full_path = os.path.expanduser(
        "{}/{}/{}/{}".format(
            config.logging.logging_dir,
            config.wallet.get("name", bt.defaults.wallet.name),
            config.wallet.get("hotkey", bt.defaults.wallet.hotkey),
            config.miner.name,
        )
    )
    config.miner.full_path = os.path.expanduser(full_path)
    if not os.path.exists(config.miner.full_path):
        os.makedirs(config.miner.full_path)


def get_config() -> "bt.Config":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--axon.port", type=int, default=8098, help="Port to run the axon on."
    )
    # Subtensor network to connect to
    parser.add_argument(
        "--subtensor.network",
        default="finney",
        help="Bittensor network to connect to.",
    )
    # Chain endpoint to connect to
    parser.add_argument(
        "--subtensor.chain_endpoint",
        default="wss://entrypoint-finney.opentensor.ai:443",
        help="Chain endpoint to connect to.",
    )
    # Adds override arguments for network and netuid.
    parser.add_argument(
        "--netuid", type=int, default=1, help="The chain subnet uid."
    )

    parser.add_argument(
        "--miner.root",
        type=str,
        help="Trials for this miner go in miner.root / (wallet_cold - wallet_hot) / miner.name",
        default="~/.bittensor/miners/",
    )
    parser.add_argument(
        "--miner.name",
        type=str,
        help="Trials for this miner go in miner.root / (wallet_cold - wallet_hot) / miner.name",
        default="Bittensor Miner",
    )

    # Run config.
    parser.add_argument(
        "--miner.blocks_per_epoch",
        type=int,  # was type=str; the value is used as a block count
        help="Blocks until the miner repulls the metagraph from the chain",
        default=100,
    )

    # Switches.
    parser.add_argument(
        "--miner.no_serve",
        action="store_true",
        help="If True, the miner doesn't serve the axon.",
        default=False,
    )
    parser.add_argument(
        "--miner.no_start_axon",
        action="store_true",
        help="If True, the miner doesn't start the axon.",
        default=False,
    )

    # Mocks.
    parser.add_argument(
        "--miner.mock_subtensor",
        action="store_true",
        help="If True, the miner will allow non-registered hotkeys to mine.",
        default=False,
    )

    # Adds subtensor specific arguments, i.e. --subtensor.chain_endpoint ... --subtensor.network ...
    bt.subtensor.add_args(parser)

    # Adds logging specific arguments, i.e. --logging.debug ..., --logging.trace ... or --logging.logging_dir ...
    bt.logging.add_args(parser)

    # Adds wallet specific arguments, i.e. --wallet.name ..., --wallet.hotkey ... or --wallet.path ...
    bt.wallet.add_args(parser)

    # Adds axon specific arguments, i.e. --axon.port ...
    bt.axon.add_args(parser)

    # Activating the parser to read any command-line inputs.
    # To print the help message, run python3 template/miner.py --help
    config = bt.config(parser)

    # Logging captures events for diagnosis or understanding the miner's behavior.
    config.full_path = os.path.expanduser(
        "{}/{}/{}/netuid{}/{}".format(
            config.logging.logging_dir,
            config.wallet.name,
            config.wallet.hotkey,
            config.netuid,
            "miner",
        )
    )
    # Ensure the directory for logging exists, else create one.
    if not os.path.exists(config.full_path):
        os.makedirs(config.full_path, exist_ok=True)
    return config


1060---
1061File: /docs/stream_tutorial/miner.py
1062---
1063
1064import copy
1065import time
1066import asyncio
1067import argparse
1068import threading
1069import traceback
1070from abc import ABC, abstractmethod
1071from functools import partial
1072from starlette.types import Send
1073
1074import bittensor as bt
1075from transformers import GPT2Tokenizer
1076from typing import List, Dict, Tuple, Union, Callable, Awaitable
1077
1078from protocol import StreamPrompting
1079from config import get_config, check_config
1080
1081
1082class StreamMiner(ABC):
1083 def __init__(self, config=None, axon=None, wallet=None, subtensor=None):
1084 # Setup base config from Miner.config() and merge with subclassed config.
1085 base_config = copy.deepcopy(config or get_config())
1086 self.config = self.config()
1087 self.config.merge(base_config)
1088
1089 check_config(StreamMiner, self.config)
1090 bt.logging.info(self.config) # TODO: duplicate print?
1091
1092 self.prompt_cache: Dict[str, Tuple[str, int]] = {}
1093
1094 # Activating Bittensor's logging with the set configurations.
1095 bt.logging(config=self.config, logging_dir=self.config.full_path)
1096 bt.logging.info("Setting up bittensor objects.")
1097
1098 # Wallet holds cryptographic information, ensuring secure transactions and communication.
1099 self.wallet = wallet or bt.wallet(config=self.config)
1100 bt.logging.info(f"Wallet {self.wallet}")
1101
1102 # subtensor manages the blockchain connection, facilitating interaction with the Bittensor blockchain.
1103 self.subtensor = subtensor or bt.subtensor(config=self.config)
1104 bt.logging.info(f"Subtensor: {self.subtensor}")
1105 bt.logging.info(
1106 f"Running miner for subnet: {self.config.netuid} on network: {self.subtensor.chain_endpoint} with config:"
1107 )
1108
1109 # metagraph provides the network's current state, holding state about other participants in a subnet.
1110 self.metagraph = self.subtensor.metagraph(self.config.netuid)
1111 bt.logging.info(f"Metagraph: {self.metagraph}")
1112
1113 if self.wallet.hotkey.ss58_address not in self.metagraph.hotkeys:
1114 bt.logging.error(
1115 f"\nYour miner: {self.wallet} is not registered to chain connection: {self.subtensor} \nRun btcli register and try again."
1116 )
1117 exit()
1118 else:
1119 # Each miner gets a unique identity (UID) in the network for differentiation.
1120 self.my_subnet_uid = self.metagraph.hotkeys.index(
1121 self.wallet.hotkey.ss58_address
1122 )
1123 bt.logging.info(f"Running miner on uid: {self.my_subnet_uid}")
1124
1125 # The axon handles request processing, allowing validators to send requests to this process.
1126 self.axon = axon or bt.axon(
1127 wallet=self.wallet, port=self.config.axon.port
1128 )
1129 # Attach the functions that determine how incoming requests are serviced.
1130 bt.logging.info(f"Attaching forward function to axon: {self._prompt}")
1132 self.axon.attach(
1133 forward_fn=self._prompt,
1134 )
1135 bt.logging.info(f"Axon created: {self.axon}")
1136
1137 # Instantiate runners
1138 self.should_exit: bool = False
1139 self.is_running: bool = False
1140 self.thread: threading.Thread = None
1141 self.lock = asyncio.Lock()
1142 self.request_timestamps: Dict = {}
1143
1144 @abstractmethod
1145 def config(self) -> "bt.Config":
1146 ...
1147
1148 @classmethod
1149 @abstractmethod
1150 def add_args(cls, parser: argparse.ArgumentParser):
1151 ...
1152
1153 def _prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1154 """
1155 A wrapper method around the `prompt` method that will be defined by the subclass.
1156
1157 This method acts as an intermediary layer to perform pre-processing before calling the
1158 actual `prompt` method implemented in the subclass. Specifically, it checks whether a
1159 prompt is in cache to avoid reprocessing recent requests. If the prompt is not in the
1160 cache, the subclass `prompt` method is called.
1161
1162 Args:
1163 synapse (StreamPrompting): The incoming request object encapsulating the details of the request.
1164
1165 Returns:
1166 StreamPrompting: The response object to be sent back in reply to the incoming request, essentially
1167 the filled synapse request object.
1168
1169 Raises:
1170 ValueError: If the prompt is found in the cache indicating it was sent recently.
1171
1172 Example:
1173 This method is not meant to be called directly but is invoked internally when a request
1174 is received, and it subsequently calls the `prompt` method of the subclass.
1175 """
1176 return self.prompt(synapse)
1177
1178 @abstractmethod
1179 def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1180 """
1181 Abstract method to handle and respond to incoming requests to the miner.
1182
1183 Subclasses should implement this method to define their custom logic for processing and
1184 responding to requests. This method is designed to be overridden, and its behavior will
1185 be dependent on the specific implementation provided in the subclass.
1186
1187 Args:
1188 synapse (StreamPrompting): The incoming request object encapsulating the details
1189 of the request. This must contain `messages` and `roles` as fields.
1190
1191 Returns:
1192 StreamPrompting: The response object that should be sent back in reply to the
1193 incoming request. This is essentially the filled synapse request object.
1194
1195 Example:
1196 class CustomMiner(Miner):
1197 def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1198 # Custom logic to process and respond to the request.
1199 synapse.completion = "The meaning of life is 42."
1200 return synapse
1201 """
1202 ...
1203
1204 def run(self):
1205 """
1206 Runs the miner logic. This method starts the miner's operations, including
1207 listening for incoming requests and periodically updating the miner's knowledge
1208 of the network graph.
1209 """
1210 if not self.subtensor.is_hotkey_registered(
1211 netuid=self.config.netuid,
1212 hotkey_ss58=self.wallet.hotkey.ss58_address,
1213 ):
1214 bt.logging.error(
1215 f"Wallet: {self.wallet} is not registered on netuid {self.config.netuid}. "
1216 f"Please register the hotkey using `btcli subnets register` before trying again."
1217 )
1218 exit()
1219
1220 # Serve passes the axon information to the network + netuid we are hosting on.
1221 # This will auto-update if the axon port or external IP have changed.
1222 bt.logging.info(
1223 f"Serving axon {StreamPrompting} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
1224 )
1225 self.axon.serve(netuid=self.config.netuid, subtensor=self.subtensor)
1226
1227 # Start starts the miner's axon, making it active on the network.
1228 bt.logging.info(
1229 f"Starting axon server on port: {self.config.axon.port}"
1230 )
1231 self.axon.start()
1232
1233 # --- Run until should_exit = True.
1234 self.last_epoch_block = self.subtensor.get_current_block()
1235 bt.logging.info(f"Miner starting at block: {self.last_epoch_block}")
1236
1237 # This loop maintains the miner's operations until intentionally stopped.
1238 bt.logging.info(f"Starting main loop")
1239 step = 0
1240 try:
1241 while not self.should_exit:
1242 start_epoch = time.time()
1243
1244 # --- Wait until next epoch.
1245 current_block = self.subtensor.get_current_block()
1246 while (
1247 current_block - self.last_epoch_block
1248 < self.config.miner.blocks_per_epoch
1249 ):
1250 # --- Wait for next block.
1251 time.sleep(1)
1252 current_block = self.subtensor.get_current_block()
1253
1254 # --- Check if we should exit.
1255 if self.should_exit:
1256 break
1257
1258 # --- Update the metagraph with the latest network state.
1259 self.last_epoch_block = self.subtensor.get_current_block()
1260
1261 metagraph = self.subtensor.metagraph(
1262 netuid=self.config.netuid,
1263 lite=True,
1264 block=self.last_epoch_block,
1265 )
1266 log = (
1267 f"Step:{step} | "
1268 f"Block:{metagraph.block.item()} | "
1269 f"Stake:{metagraph.S[self.my_subnet_uid]} | "
1270 f"Rank:{metagraph.R[self.my_subnet_uid]} | "
1271 f"Trust:{metagraph.T[self.my_subnet_uid]} | "
1272 f"Consensus:{metagraph.C[self.my_subnet_uid]} | "
1273 f"Incentive:{metagraph.I[self.my_subnet_uid]} | "
1274 f"Emission:{metagraph.E[self.my_subnet_uid]}"
1275 )
1276 bt.logging.info(log)
1277
1278 step += 1
1279
1280 # If someone intentionally stops the miner, it'll safely terminate operations.
1281 except KeyboardInterrupt:
1282 self.axon.stop()
1283 bt.logging.success("Miner killed by keyboard interrupt.")
1284 exit()
1285
1286 # In case of unforeseen errors, the miner will log the error and continue operations.
1287 except Exception as e:
1288 bt.logging.error(traceback.format_exc())
1289
1290 def run_in_background_thread(self):
1291 """
1292 Starts the miner's operations in a separate background thread.
1293 This is useful for non-blocking operations.
1294 """
1295 if not self.is_running:
1296 bt.logging.debug("Starting miner in background thread.")
1297 self.should_exit = False
1298 self.thread = threading.Thread(target=self.run, daemon=True)
1299 self.thread.start()
1300 self.is_running = True
1301 bt.logging.debug("Started")
1302
1303 def stop_run_thread(self):
1304 """
1305 Stops the miner's operations that are running in the background thread.
1306 """
1307 if self.is_running:
1308 bt.logging.debug("Stopping miner in background thread.")
1309 self.should_exit = True
1310 self.thread.join(5)
1311 self.is_running = False
1312 bt.logging.debug("Stopped")
1313
1314 def __enter__(self):
1315 """
1316 Starts the miner's operations in a background thread upon entering the context.
1317 This method facilitates the use of the miner in a 'with' statement.
1318 """
1319 self.run_in_background_thread()
1320
1321 def __exit__(self, exc_type, exc_value, traceback):
1322 """
1323 Stops the miner's background operations upon exiting the context.
1324 This method facilitates the use of the miner in a 'with' statement.
1325
1326 Args:
1327 exc_type: The type of the exception that caused the context to be exited.
1328 None if the context was exited without an exception.
1329 exc_value: The instance of the exception that caused the context to be exited.
1330 None if the context was exited without an exception.
1331 traceback: A traceback object encoding the stack trace.
1332 None if the context was exited without an exception.
1333 """
1334 self.stop_run_thread()
1335
1336
1337class StreamingTemplateMiner(StreamMiner):
1338 def config(self) -> "bt.Config":
1339 """
1340 Returns the configuration object specific to this miner.
1341
1342 Implement and extend this method to provide custom configurations for the miner.
1343 Currently, it sets up a basic configuration parser.
1344
1345 Returns:
1346 bt.Config: A configuration object with the miner's operational parameters.
1347 """
1348 parser = argparse.ArgumentParser(description="Streaming Miner Configs")
1349 self.add_args(parser)
1350 return bt.config(parser)
1351
1352 def add_args(cls, parser: argparse.ArgumentParser):
1353 """
1354 Adds custom arguments to the command line parser.
1355
1356 Developers can introduce additional command-line arguments specific to the miner's
1357 functionality in this method. These arguments can then be used to configure the miner's operation.
1358
1359 Args:
1360 parser (argparse.ArgumentParser):
1361 The command line argument parser to which custom arguments should be added.
1362 """
1363 pass
1364
1365 def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1366 """
1367 Generates a streaming response for the provided synapse.
1368
1369 This function serves as the main entry point for handling streaming prompts. It takes
1370 the incoming synapse which contains messages to be processed and returns a streaming
1371 response. The function uses the GPT-2 tokenizer and a simulated model to tokenize and decode
1372 the incoming message, and then sends the response back to the client token by token.
1373
1374 Args:
1375 synapse (StreamPrompting): The incoming StreamPrompting instance containing the messages to be processed.
1376
1377 Returns:
1378 StreamPrompting: The streaming response object which can be used by other functions to
1379 stream back the response to the client.
1380
1381 Usage:
1382 This function can be extended and customized based on specific requirements of the
1383 miner. Developers can swap out the tokenizer, model, or adjust how streaming responses
1384 are generated to suit their specific applications.
1385 """
1386 bt.logging.trace("In outer PROMPT()")
1387 tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1388
1389 # Simulated function to decode token IDs into strings. In a real-world scenario,
1390 # this can be replaced with an actual model inference step.
1391 def model(ids):
1392 return (tokenizer.decode(id) for id in ids)
1393
1394 async def _prompt(text: str, send: Send):
1395 """
1396 Asynchronously processes the input text and sends back tokens as a streaming response.
1397
1398 This function takes an input text, tokenizes it using the GPT-2 tokenizer, and then
1399 uses the simulated model to decode token IDs into strings. It then sends each token
1400 back to the client as a streaming response, with a delay between tokens to simulate
1401 the effect of real-time streaming.
1402
1403 Args:
1404 text (str): The input text message to be processed.
1405 send (Send): An asynchronous function that allows sending back the streaming response.
1406
1407 Usage:
1408 This function can be adjusted based on the streaming requirements, speed of
1409 response, or the model being used. Developers can also introduce more sophisticated
1410 processing steps or modify how tokens are sent back to the client.
1411 """
1412 bt.logging.trace("In inner _PROMPT()")
1413 input_ids = tokenizer(
1414 text, return_tensors="pt"
1415 ).input_ids.squeeze()
1416 buffer = []
1417 bt.logging.debug(f"Input text: {text}")
1418 bt.logging.debug(f"Input ids: {input_ids}")
1419
1420 N = 3 # Number of tokens to send back to the client at a time
1421 for token in model(input_ids):
1422 bt.logging.trace(f"appending token: {token}")
1423 buffer.append(token)
1424 # If buffer has N tokens, send them back to the client.
1425 if len(buffer) == N:
1426 time.sleep(0.1)
1427 joined_buffer = "".join(buffer)
1428 bt.logging.debug(f"sending tokens: {joined_buffer}")
1429 await send(
1430 {
1431 "type": "http.response.body",
1432 "body": joined_buffer.encode("utf-8"),
1433 "more_body": True,
1434 }
1435 )
1436 bt.logging.debug(f"Streamed tokens: {joined_buffer}")
1437 buffer = [] # Clear the buffer for next batch of tokens
1438
1439 # Send any remaining tokens in the buffer
1440 if buffer:
1441 joined_buffer = "".join(buffer)
1442 await send(
1443 {
1444 "type": "http.response.body",
1445 "body": joined_buffer.encode("utf-8"),
1446 "more_body": False, # No more tokens to send
1447 }
1448 )
1449 bt.logging.trace(f"Streamed tokens: {joined_buffer}")
1450
1451 message = synapse.messages[0]
1452 bt.logging.trace(f"message in _prompt: {message}")
1453 token_streamer = partial(_prompt, message)
1454 bt.logging.trace(f"token streamer: {token_streamer}")
1455 return synapse.create_streaming_response(token_streamer)
1456
1457
1458# This is the main function, which runs the miner.
1459if __name__ == "__main__":
1460 with StreamingTemplateMiner():
1461 while True:
1462 time.sleep(1)
1463
1464
1465
1466---
1467File: /docs/stream_tutorial/protocol.py
1468---
1469
1470import pydantic
1471import bittensor as bt
1472
1473from abc import ABC, abstractmethod
1474from typing import List, Union, Callable, Awaitable
1475from starlette.responses import StreamingResponse
1476
1477
1478class StreamPrompting(bt.StreamingSynapse):
1479 """
1480 StreamPrompting is a specialized implementation of the `StreamingSynapse` tailored for prompting functionalities within
1481 the Bittensor network. This class is intended to interact with a streaming response that contains a sequence of tokens,
1482 which represent prompts or messages in a certain scenario.
1483
1484 As a developer, when using or extending the `StreamPrompting` class, you should be primarily focused on the structure
1485 and behavior of the prompts you are working with. The class has been designed to seamlessly handle the streaming,
1486 decoding, and accumulation of tokens that represent these prompts.
1487
1488 Attributes:
1489 - `roles` (List[str]): A list of roles involved in the prompting scenario. This could represent different entities
1490 or agents involved in the conversation or use-case. They are immutable to ensure consistent
1491 interaction throughout the lifetime of the object.
1492
1493 - `messages` (List[str]): These represent the actual prompts or messages in the prompting scenario. They are also
1494 immutable to ensure consistent behavior during processing.
1495
1496 - `completion` (str): Stores the processed result of the streaming tokens. As tokens are streamed, decoded, and
1497 processed, they are accumulated in the completion attribute. This represents the "final"
1498 product or result of the streaming process.
1499 - `required_hash_fields` (List[str]): A list of fields that are required for the hash.
1500
1501 Methods:
1502 - `process_streaming_response`: This method asynchronously processes the incoming streaming response by decoding
1503 the tokens and accumulating them in the `completion` attribute.
1504
1505 - `deserialize`: Converts the `completion` attribute into its desired data format, in this case, a string.
1506
1507 - `extract_response_json`: Extracts relevant JSON data from the response, useful for gaining insights on the response's
1508 metadata or for debugging purposes.
1509
1510 Note: While you can directly use the `StreamPrompting` class, it's designed to be extensible. Thus, you can create
1511 subclasses to further customize behavior for specific prompting scenarios or requirements.
1512 """
1513
1514 roles: List[str] = pydantic.Field(
1515 ...,
1516 title="Roles",
1517 description="A list of roles in the StreamPrompting scenario. Immutable.",
1518 allow_mutation=False,
1519 )
1520
1521 messages: List[str] = pydantic.Field(
1522 ...,
1523 title="Messages",
1524 description="A list of messages in the StreamPrompting scenario. Immutable.",
1525 allow_mutation=False,
1526 )
1527
1528 required_hash_fields: List[str] = pydantic.Field(
1529 ["messages"],
1530 title="Required Hash Fields",
1531 description="A list of required fields for the hash.",
1532 allow_mutation=False,
1533 )
1534
1535 completion: str = pydantic.Field(
1536 "",
1537 title="Completion",
1538 description="Completion status of the current StreamPrompting object. This attribute is mutable and can be updated.",
1539 )
1540
1541 async def process_streaming_response(self, response: StreamingResponse):
1542 """
1543 `process_streaming_response` is an asynchronous method designed to process the incoming streaming response from the
1544 Bittensor network. It's the heart of the StreamPrompting class, ensuring that streaming tokens, which represent
1545 prompts or messages, are decoded and appropriately managed.
1546
1547 As the streaming response is consumed, the tokens are decoded from their 'utf-8' encoded format, split based on
1548 newline characters, and concatenated into the `completion` attribute. This accumulation of decoded tokens in the
1549 `completion` attribute allows for a continuous and coherent accumulation of the streaming content.
1550
1551 Args:
1552 response: The streaming response object containing the content chunks to be processed. Each chunk in this
1553 response is expected to be a set of tokens that can be decoded and split into individual messages or prompts.
1554 """
1555 if self.completion is None:
1556 self.completion = ""
1557 bt.logging.debug(
1558 "Processing streaming response (StreamingSynapse base class)."
1559 )
1560 async for chunk in response.content.iter_any():
1561 bt.logging.debug(f"Processing chunk: {chunk}")
1562 tokens = chunk.decode("utf-8").split("\n")
1563 for token in tokens:
1564 bt.logging.debug(f"--processing token: {token}")
1565 if token:
1566 self.completion += token
1567 bt.logging.debug(f"yielding tokens {tokens}")
1568 yield tokens
1569
1570 def deserialize(self) -> str:
1571 """
1572 Deserializes the response by returning the completion attribute.
1573
1574 Returns:
1575 str: The completion result.
1576 """
1577 return self.completion
1578
1579 def extract_response_json(self, response: StreamingResponse) -> dict:
1580 """
1581 `extract_response_json` is a method that performs the crucial task of extracting pertinent JSON data from the given
1582 response. The method is especially useful when you need a detailed insight into the streaming response's metadata
1583 or when debugging response-related issues.
1584
1585 Beyond just extracting the JSON data, the method also processes and structures the data for easier consumption
1586 and understanding. For instance, it extracts specific headers related to dendrite and axon, offering insights
1587 about the Bittensor network's internal processes. The method ultimately returns a dictionary with a structured
1588 view of the extracted data.
1589
1590 Args:
1591 response: The response object from which to extract the JSON data. This object typically includes headers and
1592 content which can be used to glean insights about the response.
1593
1594 Returns:
1595 dict: A structured dictionary containing:
1596 - Basic response metadata such as name, timeout, total_size, and header_size.
1597 - Dendrite and Axon related information extracted from headers.
1598 - Roles and Messages pertaining to the current StreamPrompting instance.
1599 - The accumulated completion.
1600 """
1601 headers = {
1602 k.decode("utf-8"): v.decode("utf-8")
1603 for k, v in response.__dict__["_raw_headers"]
1604 }
1605
1606 def extract_info(prefix):
1607 return {
1608 key.split("_")[-1]: value
1609 for key, value in headers.items()
1610 if key.startswith(prefix)
1611 }
1612
1613 return {
1614 "name": headers.get("name", ""),
1615 "timeout": float(headers.get("timeout", 0)),
1616 "total_size": int(headers.get("total_size", 0)),
1617 "header_size": int(headers.get("header_size", 0)),
1618 "dendrite": extract_info("bt_header_dendrite"),
1619 "axon": extract_info("bt_header_axon"),
1620 "roles": self.roles,
1621 "messages": self.messages,
1622 "completion": self.completion,
1623 }
1624
1625
1626
1627---
1628File: /docs/stream_tutorial/README.md
1629---
1630
1631# Bittensor Streaming Tutorial
1632This document is intended as a developer-friendly walkthrough of integrating streaming into your bittensor application.
1633
1634If you prefer to jump right into a complete stand-alone example, see:
1635- `miner.py`
1636- `protocol.py`
1637- `client.py`
1638
1639Start your miner:
1640```bash
1641python miner.py --netuid 8 --wallet.name default --wallet.hotkey miner --subtensor.network test --axon.port 10000 --logging.trace
1642```
1643
1644Run the client:
1645```bash
1646python client.py --netuid 8 --my_uid 1 --network test
1647```
1648
1649## Overview
1650This tutorial is designed to show you how to use the streaming API to integrate streaming into your application. It will cover the following topics:
1651- writing your streaming protocol (inherits from bittensor.StreamingSynapse)
1652- writing your streaming server (uses your streaming protocol)
1653- writing your streaming client (uses your streaming protocol)
1654
1655### Defining your streaming protocol
1656When designing your protocol, it would be helpful to look at `bittensor.StreamingSynapse` for reference. Below is a condensed snippet of the abstract methods that you will need to implement in your subclass.
1657
1658You will need to implement two methods:
1659
1660- `process_streaming_response`
1661- `extract_response_json`
1662
1663These two methods are the core of your streaming protocol. The first, `process_streaming_response`, is called as the response is being streamed from the network and is responsible for handling the streaming response, such as parsing and accumulating data. The second, `extract_response_json`, is called after the response has been processed and is responsible for retrieving structured data to be post-processed in the dendrite in bittensor core code.
1664
1665```python
1666class StreamingSynapse(bittensor.Synapse, ABC):
1667 ...
1668 class BTStreamingResponse(_StreamingResponse):
1669 ...
1670 @abstractmethod
1671 async def process_streaming_response(self, response: Response):
1672 """
1673 Abstract method that must be implemented by the subclass.
1674 This method should provide logic to handle the streaming response, such as parsing and accumulating data.
1675 It is called as the response is being streamed from the network, and should be implemented to handle the specific
1676 streaming data format and requirements of the subclass.
1677
1678 Args:
1679 response: The response object to be processed, typically containing chunks of data.
1680 """
1681 ...
1682
1683 @abstractmethod
1684 def extract_response_json(self, response: Response) -> dict:
1685 """
1686 Abstract method that must be implemented by the subclass.
1687 This method should provide logic to extract JSON data from the response, including headers and content.
1688 It is called after the response has been processed and is responsible for retrieving structured data
1689 that can be used by the application.
1690
1691 Args:
1692 response: The response object from which to extract JSON data.
1693 """
1694 ...
1695 ...
1696```
1697
1698See the full reference code at the bittensor [repo](https://github.com/opentensor/bittensor/blob/master/bittensor/stream.py).
1699
1700
1701#### Create your protocol
1702Let's walk through how to create a protocol using the bittensor.StreamingSynapse class.
1703```python
1704class MyStreamingSynapse(bt.StreamingSynapse):
1705 # define your expected data fields here as pydantic field objects
1706 # This allows you to control what information is passed along the network
1707 messages: List[str] = pydantic.Field(
1708 ..., # this ellipsis (...) indicates the object is required
1709 title="Messages", # What is the name of this field?
1710 description="A list of messages in the Prompting scenario. Immutable.",
1711 allow_mutation=False, # disallow modification of this field after creation
1712 )
1713 completion: str = pydantic.Field(
1714 "",
1715 title="Completion",
1716 )
1717 # add fields as necessary
1718 ...
1719
1720 # This method controls how your synapse is deserialized from the network
1721 # E.g. you can extract whatever information you want to receive at the final
1722 # yield in the async generator returned by the server, without receiving
1723 # the entire synapse object itself.
1724 # In this example, we just want the completion string at the end.
1725 def deserialize(self) -> str:
1726 return self.completion
1727
1728 # implement your `process_streaming_response` logic to actually yield objects to the streamer
1729 # this effectively defines the async generator that you'll receive on the client side
1730 async def process_streaming_response(self, response: StreamingResponse):
1731 # this is an example of how you might process a streaming response
1732 # iterate over the response content and yield each line
1733 async for chunk in response.content.iter_any():
1734 tokens = chunk.decode("utf-8").split("\n")
1735 yield tokens
1736
1737 # implement `extract_response_json` to extract the JSON data from the response headers
1738 # this will be dependent on the data you are streaming and how you want to structure it
1739 # it MUST conform to the following format expected by the bittensor dendrite:
1740 """
1741 {
1742 # METADATA AND HEADERS
1743 "name": ...,
1744 "timeout": float(...),
1745 "total_size": int(...),
1746 "header_size": int(...),
1747 "dendrite": ...,
1748 "axon": ...,
1749 # YOUR FIELDS
1750 "messages": self.messages,
1751 ...
1752 }
1753 """
1754 def extract_response_json(self, response: StreamingResponse) -> dict:
1755 # iterate over the response headers and extract the necessary data
1756 headers = {
1757 k.decode("utf-8"): v.decode("utf-8")
1758 for k, v in response.__dict__["_raw_headers"]
1759 }
1760 # helper function to extract data from headers
1761 def extract_info(prefix):
1762 return {
1763 key.split("_")[-1]: value
1764 for key, value in headers.items()
1765 if key.startswith(prefix)
1766 }
1767 # return the extracted data in the expected format
1768 return {
1769 "name": headers.get("name", ""),
1770 "timeout": float(headers.get("timeout", 0)),
1771 "total_size": int(headers.get("total_size", 0)),
1772 "header_size": int(headers.get("header_size", 0)),
1773 "dendrite": extract_info("bt_header_dendrite"), # dendrite info
1774 "axon": extract_info("bt_header_axon"), # axon info
1775 "messages": self.messages, # field object
1776 }
1777```
1778
1779[Here](https://github.com/opentensor/text-prompting/blob/main/prompting/protocol.py#L131) is a full example implementation of a streaming protocol based on the text-prompting network.
1780
1781Please read the docstrings provided; they can be very helpful!
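To make the fields concrete, here is a small, illustrative sketch of the example synapse in isolation (the message and completion values are placeholders):

```python
# Hypothetical local usage of the MyStreamingSynapse example above.
syn = MyStreamingSynapse(messages=["hello, stream!"])

# `completion` is mutable; it is where streamed tokens accumulate.
syn.completion = "hello back"

# deserialize() controls what the caller ultimately receives; here,
# just the accumulated completion string.
assert syn.deserialize() == "hello back"
```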
1782
1783### Writing the server
1784Great! Now that we have our protocol defined, let's see how to define our server.
1785This will generate the tokens to be streamed in this prompting example.
1786
1787For brevity we will not be building a full miner, but inspecting the central components.
1788```python
1789class MyStreamPromptingMiner(bt.Miner):
1790 ... # any relevant methods you'd need for your miner
1791
1792 # define your server forward here
1793 # NOTE: It is crucial that your typehints are correct and reflect your streaming protocol object
1794 # otherwise the axon will reject adding your route to the server.
1795 def forward(self, synapse: MyStreamingSynapse) -> MyStreamingSynapse:
1796 # Let's use a GPT2 tokenizer for this toy example
1797 tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1798
1799 # Simulated function to decode token IDs into strings. In a real-world scenario,
1800 # this can be replaced with an actual model inference step.
1801 def model(ids):
1802 return (tokenizer.decode(id) for id in ids)
1803
1804 # This function is called asynchronously to process the input text and send back tokens
1805 # as a streaming response. It essentially produces the async generator that will be
1806 # consumed by the client with an `async for` loop.
1807 async def _forward(text: str, send: Send):
1808 # `text` may be the input prompt to your model in a real-world scenario.
1809 # let's tokenize them into IDs for the sake of this example.
1810 input_ids = tokenizer(text, return_tensors="pt").input_ids.squeeze()
1811
1812 # You may want to buffer your tokens before sending them back to the client.
1813 # this can be useful so we aren't flooding the client with individual tokens
1814 # and allows you more fine-grained control over how much data is sent back
1815 # with each yield.
1816 N = 3 # Number of tokens to send back to the client at a time
1817 buffer = []
1818 # Iterate over the tokens and send the generated tokens back to the client
1819 # when we have sufficient (N) tokens in the buffer.
1820 for token in model(input_ids):
1821 buffer.append(token) # Add token to buffer
1822
1823 # If buffer has N tokens, send them back to the client.
1824 if len(buffer) == N:
1825 joined_buffer = "".join(buffer)
1826 # Send the tokens back to the client
1827 # This is the core of the streaming response and the format
1828 # is important. The `send` function is provided by the ASGI server
1829 # and is responsible for sending the response back to the client.
1830 # This buffer will be received by the client as a single chunk of
1831 # data, which can then be split into individual tokens!
1832 await send(
1833 {
1834 "type": "http.response.body",
1835 "body": joined_buffer.encode("utf-8"),
1836 "more_body": True,
1837 }
1838 )
1839 buffer = [] # Clear the buffer for next batch of tokens
1840
1841 # Create a streaming response object using the `_forward` function
1842 # It is useful to wrap your _forward function in a partial function
1843 # to pass in the text argument lazily.
1844 token_streamer = partial(_forward, synapse.messages[0])
1845 # Return the streaming response object, which is an instance of the
1846 # `BTStreamingResponse` class.
1847 return synapse.create_streaming_response(token_streamer)
1848```
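How does this `forward` handler actually get exposed to the network? Mirroring the `StreamMiner` setup in `miner.py` above, the handler is attached to an axon, announced on-chain, and started. A condensed wiring sketch follows, assuming `wallet`, `subtensor`, `config`, and `miner` objects constructed as in the miner code above:

```python
# Condensed wiring sketch based on the StreamMiner.__init__/run code above.
axon = bt.axon(wallet=wallet, port=config.axon.port)

# Type hints on the forward function must match the streaming protocol,
# otherwise the axon will reject the route (see the NOTE above).
axon.attach(forward_fn=miner.forward)

# Announce the axon for our netuid on-chain, then begin serving requests.
axon.serve(netuid=config.netuid, subtensor=subtensor)
axon.start()
```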
1849
1850#### Complete Example
1851Here is a full example for reference:
1852> This inherits from the prompting (text-prompting) miner base class.
1853> Take a look at the `prompting/baseminer/miner.py` file [here](https://github.com/opentensor/text-prompting/blob/main/prompting/baseminer/miner.py) for more details.
1854
1855```python
1856class StreamingTemplateMiner(prompting.Miner):
1857 def config(self) -> "bt.Config":
1858 """
1859 Returns the configuration object specific to this miner.
1860
1861 Implement and extend this method to provide custom configurations for the miner.
1862 Currently, it sets up a basic configuration parser.
1863
1864 Returns:
1865 bt.Config: A configuration object with the miner's operational parameters.
1866 """
1867 parser = argparse.ArgumentParser(description="Streaming Miner Configs")
1868 self.add_args(parser)
1869 return bt.config(parser)
1870
1871 def add_args(cls, parser: argparse.ArgumentParser):
1872 """
1873 Adds custom arguments to the command line parser.
1874
1875 Developers can introduce additional command-line arguments specific to the miner's
1876 functionality in this method. These arguments can then be used to configure the miner's operation.
1877
1878 Args:
1879 parser (argparse.ArgumentParser):
1880 The command line argument parser to which custom arguments should be added.
1881 """
1882 pass
1883
1884 def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1885 """
1886 Generates a streaming response for the provided synapse.
1887
1888 This function serves as the main entry point for handling streaming prompts. It takes
1889 the incoming synapse which contains messages to be processed and returns a streaming
1890 response. The function uses the GPT-2 tokenizer and a simulated model to tokenize and decode
1891 the incoming message, and then sends the response back to the client token by token.
1892
1893 Args:
1894 synapse (StreamPrompting): The incoming StreamPrompting instance containing the messages to be processed.
1895
1896 Returns:
1897 StreamPrompting: The streaming response object which can be used by other functions to
1898 stream back the response to the client.
1899
1900 Usage:
1901 This function can be extended and customized based on specific requirements of the
1902 miner. Developers can swap out the tokenizer, model, or adjust how streaming responses
1903 are generated to suit their specific applications.
1904 """
1905 bt.logging.trace("In outer PROMPT()")
1906 tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1907
1908 # Simulated function to decode token IDs into strings. In a real-world scenario,
1909 # this can be replaced with an actual model inference step.
1910 def model(ids):
1911 return (tokenizer.decode(id) for id in ids)
1912
1913 async def _prompt(text: str, send: Send):
1914 """
1915 Asynchronously processes the input text and sends back tokens as a streaming response.
1916
1917 This function takes an input text, tokenizes it using the GPT-2 tokenizer, and then
1918 uses the simulated model to decode token IDs into strings. It then sends each token
1919 back to the client as a streaming response, with a delay between tokens to simulate
1920 the effect of real-time streaming.
1921
1922 Args:
1923 text (str): The input text message to be processed.
1924 send (Send): An asynchronous function that allows sending back the streaming response.
1925
1926 Usage:
1927 This function can be adjusted based on the streaming requirements, speed of
1928 response, or the model being used. Developers can also introduce more sophisticated
1929 processing steps or modify how tokens are sent back to the client.
1930 """
1931 bt.logging.trace("In inner _PROMPT()")
1932 input_ids = tokenizer(text, return_tensors="pt").input_ids.squeeze()
1933 buffer = []
1934 bt.logging.debug(f"Input text: {text}")
1935 bt.logging.debug(f"Input ids: {input_ids}")
1936
1937 N = 3 # Number of tokens to send back to the client at a time
1938 for token in model(input_ids):
1939 bt.logging.trace(f"appending token: {token}")
1940 buffer.append(token)
1941 # If buffer has N tokens, send them back to the client.
1942 if len(buffer) == N:
1943 time.sleep(0.1)
1944 joined_buffer = "".join(buffer)
1945 bt.logging.debug(f"sending tokens: {joined_buffer}")
1946 await send(
1947 {
1948 "type": "http.response.body",
1949 "body": joined_buffer.encode("utf-8"),
1950 "more_body": True,
1951 }
1952 )
1953 bt.logging.debug(f"Streamed tokens: {joined_buffer}")
1954 buffer = [] # Clear the buffer for next batch of tokens
1955
1956 # Send any remaining tokens in the buffer
1957 if buffer:
1958 joined_buffer = "".join(buffer)
1959 await send(
1960 {
1961 "type": "http.response.body",
1962 "body": joined_buffer.encode("utf-8"),
1963 "more_body": False, # No more tokens to send
1964 }
1965 )
1966 bt.logging.trace(f"Streamed tokens: {joined_buffer}")
1967
1968 message = synapse.messages[0]
1969 bt.logging.trace(f"message in _prompt: {message}")
1970 token_streamer = partial(_prompt, message)
1971 bt.logging.trace(f"token streamer: {token_streamer}")
1972 return synapse.create_streaming_response(token_streamer)
1973```
1974
1975### Writing the client
1976Excellent! Now that we have defined our server, we can define our client.
1977
1978This assumes you have:
19791. Registered your miner on the chain (`finney`/`test`)
19802. Are serving your miner on an open port (e.g. `12345`)
1981
1982Steps:
1983- Instantiate your synapse subclass with the relevant information. E.g. `messages`, `roles`, etc.
1984- Instantiate your wallet and a dendrite client
1985- Query the dendrite client with your synapse object
1986- Iterate over the async generator to extract the yielded tokens on the server side
1987
1988```python
1989
1990# Import bittensor and asyncio (used below to run the async main function)
1991import bittensor as bt
import asyncio
1992
1993# Create your streaming synapse subclass object (the MyStreamingSynapse defined above) to house the request body
1994syn = MyStreamingSynapse(
1995 roles=["user"],
1996 messages=["hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."]
1997)
1998
1999# Create a wallet instance that must be registered on the network
2000wallet = bt.wallet(name="default", hotkey="default")
2001
2002# Instantiate the metagraph
2003metagraph = bt.metagraph(
2004 netuid=8, network="test", sync=True, lite=False
2005)
2006
2007# Grab the axon you're serving
2008my_uid = 1
2009axon = metagraph.axons[my_uid]
2010
2011# Create a Dendrite instance to handle client-side communication.
2012dendrite = bt.dendrite(wallet=wallet)
2013
2014
2015# This is an async function, so we can use the `await` keyword when querying the server with the dendrite object.
2016async def main():
2017 # Send a request to the Axon using the Dendrite, passing in a StreamPrompting
2018 # instance with roles and messages. The response is awaited, as the Dendrite
2019 # communicates asynchronously with the Axon. Returns a list of async generators.
2020 responses = await dendrite(
2021 [axon],
2022 syn,
2023 deserialize=False,
2024 streaming=True
2025 )
2026
2027 # Now that we have our responses we want to iterate over the yielded tokens
2028 # iterate over the async generator to extract the yielded tokens on server side
2029 for resp in responses:
2030 i=0
2031 async for chunk in resp:
2032 i += 1
2033 if i % 5 == 0:
2034 print()
2035 if isinstance(chunk, list):
2036 print(chunk[0], end="", flush=True)
2037 else:
2038 # last object yielded is the synapse itself with completion filled
2039 synapse = chunk
2040 break
2041
2042 # The synapse object contains the completion attribute which contains the
2043 # accumulated tokens from the streaming response.
2044
2045if __name__ == "__main__":
2046 # Run the main function with asyncio
2047 asyncio.run(main())
2048
2049```
2050There you have it!
2051
2052### Complete example
2053If you would like to see a complete standalone example that only depends on bittensor>=6.2.0, look below:
2054
2055- client.py
2056- streaming_miner.py
2058
2059# client.py
2060```python
2061# Import bittensor and the text-prompting packages
2062import bittensor as bt
2063import prompting
2064
2065# Create a StreamPrompting synapse object to house the request body
2066syn = prompting.protocol.StreamPrompting(
2067 roles=["user"],
2068 messages=["hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."])
2070
2071# create a wallet instance that must be registered on the network
2072wallet = bt.wallet(name="default", hotkey="default")
2074
2075# instantiate the metagraph
2076metagraph = bt.metagraph(
2077 netuid=8, network="test", sync=True, lite=False
2078)
2080
2081# Grab the axon you're serving
2082axon = metagraph.axons[62]
2084
2085# Create a Dendrite instance to handle client-side communication.
2086d = bt.dendrite(wallet=wallet)
2088
2089
2090async def main():
2091
2092 # Send a request to the Axon using the Dendrite, passing in a StreamPrompting
2093 # instance with roles and messages. The response is awaited, as the Dendrite
2094 # communicates asynchronously with the Axon. Returns a list of async generators.
2095 responses = await d(
2096 [axon],
2097 syn,
2098 deserialize=False,
2099 streaming=True
2100 )
2102
2103 # iterate over the async generator to extract the yielded tokens on server side
2104 for resp in responses:
2105 i=0
2106 async for chunk in resp:
2107 i += 1
2108 if i % 5 == 0:
2109 print()
2110 if isinstance(chunk, list):
2111 print(chunk[0], end="", flush=True)
2112 else:
2113 # last object yielded is the synapse itself with completion filled
2114 synapse = chunk
2115 break
2116
2117if __name__ == "__main__":
2118 import asyncio
2119 asyncio.run(main())
2120```
2121
2122
2123
2124---
2125File: /docs/running_on_mainnet.md
2126---
2127
2128# Running Subnet on Mainnet
2129
2130This tutorial shows how to use the bittensor `btcli` to create a subnetwork and connect your incentive mechanism to it.
2131
2132**IMPORTANT:** Before attempting to register on mainnet, we strongly recommend that you:
2133- First run [Running Subnet Locally](running_on_staging.md), and
2134- Then run [Running on the Testnet](running_on_testnet.md).
2135
2136Your incentive mechanisms running on the mainnet are open to anyone. They emit real TAO. Creating these mechanisms incurs a `lock_cost` in TAO.
2137
2138**DANGER**
2139- Do not expose your private keys.
2140- Only use your testnet wallet.
2141- Do not reuse the password of your mainnet wallet.
2142- Make sure your incentive mechanism is resistant to abuse.
2143
2144## Prerequisites
2145
2146Before proceeding further, make sure that you have installed Bittensor. See the below instructions:
2147
2148- [Install `bittensor`](https://github.com/opentensor/bittensor#install).
2149
2150After installing `bittensor`, proceed as below:
2151
2152## Steps
2153
2154## 1. Install your subnet template
2155
2156**NOTE: Skip this step if** you already did this during local testing and development.
2157
2158In your project directory:
2159
2160```bash
2161git clone https://github.com/opentensor/bittensor-subnet-template.git
2162```
2163
2164Next, `cd` into `bittensor-subnet-template` repo directory:
2165
2166```bash
2167cd bittensor-subnet-template
2168```
2169
2170Install the Bittensor subnet template package:
2171
2172```bash
2173python -m pip install -e . # Install your subnet template package
2174```
2175
2176## 2. Create wallets
2177
2178Create wallets for subnet owner, subnet validator and for subnet miner.
2179
2180This step creates local coldkey and hotkey pairs for your three identities: subnet owner, subnet validator and subnet miner.
2181
2182The owner will create and control the subnet. The owner must have at least 100 TAO before they can run the next steps.
2183
2184The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.
2185
2186**NOTE**: You can also use existing wallets to register. Creating new keys is shown here for reference.
2187
2188Create a coldkey for the owner wallet:
2189
2190```bash
2191btcli wallet new_coldkey --wallet.name owner
2192```
2193
2194Create a coldkey and hotkey for the subnet miner wallet:
2195```bash
2196btcli wallet new_coldkey --wallet.name miner
2197```
2198
2199and
2200
2201```bash
2202btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
2203```
2204
2205Create a coldkey and hotkey for the subnet validator wallet:
2206
2207```bash
2208btcli wallet new_coldkey --wallet.name validator
2209```
2210
2211and
2212
2213```bash
2214btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
2215```
2216
2217## 3. Getting the price of subnet creation
2218
2219Creating subnets on mainnet is competitive. The cost is determined by the rate at which new subnets are being registered onto the Bittensor blockchain.
2220
2221By default you must have at least 100 TAO on your owner wallet to create a subnet. However, the exact amount will fluctuate based on demand. The below code shows how to get the current price of creating a subnet.
2222
2223```bash
2224btcli subnet lock_cost
2225```
2226
2227The above command will show:
2228
2229```bash
2230>> Subnet lock cost: τ100.000000000
2231```
2232
2233## 4. Purchasing a slot
2234
2235Using your TAO balance, you can register your subnet to the mainchain. This will create a new subnet on the mainchain and give you the owner permissions to it. The below command shows how to purchase a slot.
2236
2237**NOTE**: Slots cost TAO to lock. You will get this TAO back when the subnet is deregistered.
2238
2239```bash
2240btcli subnet create
2241```
2242
2243Enter the owner wallet name. This gives permissions to the coldkey.
2244
2245```bash
2246>> Enter wallet name (default): owner # Enter your owner wallet name
2247>> Enter password to unlock key: # Enter your wallet password.
2248>> Register subnet? [y/n]: <y/n> # Select yes (y)
2249>> ⠇ 📡 Registering subnet...
2250✅ Registered subnetwork with netuid: 1 # Your subnet netuid will show here, save this for later.
2251```
2252
2253## 5. (Optional) Register keys
2254
2255**NOTE**: While this is not enforced, we recommend that subnet owners run a subnet validator and a subnet miner on the subnet to demonstrate proper use to the community.
2256
2257This step registers your subnet validator and subnet miner keys to the subnet giving them the **first two slots** on the subnet.
2258
2259Register your miner key to the subnet:
2260
2261```bash
2262btcli subnet recycle_register --netuid 1 --subtensor.network finney --wallet.name miner --wallet.hotkey default
2263```
2264
2265Follow the below prompts:
2266
2267```bash
2268>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
2269>> Continue Registration?
2270 hotkey: ...
2271 coldkey: ...
2272 network: finney [y/n]: # Select yes (y)
2273>> ✅ Registered
2274```
2275
2276Next, register your validator key to the subnet:
2277
2278```bash
2279btcli subnet recycle_register --netuid 1 --subtensor.network finney --wallet.name validator --wallet.hotkey default
2280```
2281
2282Follow the below prompts:
2283
2284```bash
2285>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
2286>> Continue Registration?
2287 hotkey: ...
2288 coldkey: ...
2289 network: finney [y/n]: # Select yes (y)
2290>> ✅ Registered
2291```
2292
2293## 6. Check that your keys have been registered
2294
2295Check that your subnet validator key has been registered:
2296
2297```bash
2298btcli wallet overview --wallet.name validator
2299```
2300
2301The output will be similar to the below:
2302
2303```bash
2304Subnet: 1
2305COLDKEY HOTKEY UID ACTIVE STAKE(τ) RANK TRUST CONSENSUS INCENTIVE DIVIDENDS EMISSION(ρ) VTRUST VPERMIT UPDATED AXON HOTKEY_SS58
2306miner default 0 True 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0 0.00000 14 none 5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
23071 1 2 τ0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ρ0 0.00000
2308 Wallet balance: τ0.0
2309```
2310
2311Check that your subnet miner has been registered:
2312
2313```bash
2314btcli wallet overview --wallet.name miner
2315```
2316
2317The output will be similar to the below:
2318
2319```bash
2320Subnet: 1
2321COLDKEY HOTKEY UID ACTIVE STAKE(τ) RANK TRUST CONSENSUS INCENTIVE DIVIDENDS EMISSION(ρ) VTRUST VPERMIT UPDATED AXON HOTKEY_SS58
2322miner default 1 True 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0 0.00000 14 none 5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
23231 1 2 τ0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ρ0 0.00000
2324 Wallet balance: τ0.0
2325```
2326
2327## 7. Run subnet miner and subnet validator
2328
2329Run the subnet miner:
2330
2331```bash
2332python neurons/miner.py --netuid 1 --wallet.name miner --wallet.hotkey default --logging.debug
2333```
2334
2335You will see the below terminal output:
2336
2337```bash
2338>> 2023-08-08 16:58:11.223 | INFO | Running miner for subnet: 1 on network: wss://entrypoint-finney.opentensor.ai:443 with config: ...
2339```
2340
2341Run the subnet validator:
2342
2343```bash
2344python neurons/validator.py --netuid 1 --wallet.name validator --wallet.hotkey default --logging.debug
2345```
2346
2347You will see the below terminal output:
2348
2349```bash
2350>> 2023-08-08 16:58:11.223 | INFO | Running validator for subnet: 1 on network: wss://entrypoint-finney.opentensor.ai:443 with config: ...
2351```
2352
2353## 8. Get emissions flowing
2354
2355Register to the root subnet using the `btcli`:
2356
2357```bash
2358btcli root register
2359```
2360
2361Then set your weights for the subnet:
2362
2363```bash
2364btcli root weights
2365```
2366
2367## 9. Stopping your nodes
2368
2369To stop your nodes, press CTRL + C in the terminal where the nodes are running.
2370
2371---
2372
2373
2374---
2375File: /docs/running_on_staging.md
2376---
2377
2378# Running Subnet Locally
2379
2380This tutorial will guide you through:
2381
2382- Setting up a local blockchain that is not connected to either the Bittensor testchain or mainchain
2383- Creating a subnet
2384- Running your incentive mechanism on the subnet
2385
2386## Local blockchain vs local subtensor node
2387
2388Running a local blockchain is sometimes synonymously referred to as running on staging. This is **different** from running a local subtensor node that connects to the Bittensor mainchain.
2389
2390A local subtensor node will connect to the mainchain and sync with the mainchain, giving you your own access point to the mainchain.
2391
2392Running a local blockchain spins up two authority nodes locally, not connected to any other nodes, the testchain, or the mainchain. This tutorial is for running a local blockchain.
2393
2394## Prerequisites
2395
2396Before proceeding further, make sure that you have installed Bittensor. See the below instructions:
2397
2398- [Install `bittensor`](https://github.com/opentensor/bittensor#install).
2399
2400After installing `bittensor`, proceed as below:
2401
2402## 1. Install Substrate dependencies
2403
2404Begin by installing the required dependencies for running a Substrate node.
2405
2406Update your system packages:
2407
2408```bash
2409sudo apt update
2410```
2411
2412Install additional required libraries and tools
2413
2414```bash
2415sudo apt install --assume-yes make build-essential git clang curl libssl-dev llvm libudev-dev protobuf-compiler
2416```
2417
2418## 2. Install Rust and Cargo
2419
2420Rust is the programming language used in Substrate development. Cargo is Rust's package manager.
2421
2422Install rust and cargo:
2423
2424```bash
2425curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
2426```
2427
2428Update your shell's source to include Cargo's path:
2429
2430```bash
2431source "$HOME/.cargo/env"
2432```
2433
2434## 3. Clone the subtensor repository
2435
2436This step fetches the subtensor codebase to your local machine.
2437
2438```bash
2439git clone https://github.com/opentensor/subtensor.git
2440```
2441
2442## 4. Setup Rust
2443
2444This step ensures that you have the nightly toolchain and the WebAssembly (wasm) compilation target. Note that this step will run the subtensor chain on your terminal directly, hence we advise that you run this as a background process using PM2 or other software.
2445
2446Update to the nightly version of Rust:
2447
2448```bash
2449./subtensor/scripts/init.sh
2450```
2451
2452## 5. Initialize
2453
2454These steps initialize your local subtensor chain in development mode. These commands will set up and run a local subtensor.
2455
2456Build the binary with the faucet feature enabled:
2457
2458```bash
2459cargo build --release --features pow-faucet
2460```
2461
2462**NOTE**: The `--features pow-faucet` option in the above is required if we want to use the command `btcli wallet faucet` [See the below Mint tokens step](#8-mint-tokens-from-faucet).
2463
2464Next, run the localnet script and turn off the attempt to build the binary (as we have already done this above):
2465
2466```bash
2467BUILD_BINARY=0 ./scripts/localnet.sh
2468```
2469
2470**NOTE**: Watch for any build or initialization outputs in this step. If you are building the project for the first time, this step will take a while to finish building, depending on your hardware.
2471
2472## 6. Install subnet template
2473
2474`cd` to your project directory and clone the bittensor subnet template repository:
2475
2476```bash
2477git clone https://github.com/opentensor/bittensor-subnet-template.git
2478```
2479
2480Navigate to the cloned repository:
2481
2482```bash
2483cd bittensor-subnet-template
2484```
2485
2486Install the bittensor-subnet-template Python package:
2487
2488```bash
2489python -m pip install -e .
2490```
2491
2492## 7. Set up wallets
2493
2494You will need wallets for the different roles, i.e., subnet owner, subnet validator and subnet miner, in the subnet.
2495
2496- The owner wallet creates and controls the subnet.
2497- The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.
2498
2499Create a coldkey for the owner role:
2500
2501```bash
2502btcli wallet new_coldkey --wallet.name owner
2503```
2504
2505Set up the miner's wallets:
2506
2507```bash
2508btcli wallet new_coldkey --wallet.name miner
2509```
2510
2511```bash
2512btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
2513```
2514
2515Set up the validator's wallets:
2516
2517```bash
2518btcli wallet new_coldkey --wallet.name validator
2519```
2520```bash
2521btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
2522```
2523
2524## 8. Mint tokens from faucet
2525
2526You will need tokens to initialize the incentive mechanism on the chain as well as for registering the subnet.
2527
2528Run the following commands to mint faucet tokens for the owner and for the validator.
2529
2530Mint faucet tokens for the owner:
2531
2532```bash
2533btcli wallet faucet --wallet.name owner --subtensor.chain_endpoint ws://127.0.0.1:9946
2534```
2535
2536You will see:
2537
2538```bash
2539>> Balance: τ0.000000000 ➡ τ100.000000000
2540```
2541
2542Mint tokens for the validator:
2543
2544```bash
2545btcli wallet faucet --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946
2546```
2547
2548You will see:
2549
2550```bash
2551>> Balance: τ0.000000000 ➡ τ100.000000000
2552```
2553
2554## 9. Create a subnet
2555
2556The below commands establish a new subnet on the local chain. The cost will be exactly τ1000.000000000 for the first subnet you create, and you'll have to run the faucet several times to get enough tokens.
2557
2558```bash
2559btcli subnet create --wallet.name owner --subtensor.chain_endpoint ws://127.0.0.1:9946
2560```
2561
2562You will see:
2563
2564```bash
2565>> Your balance is: τ200.000000000
2566>> Do you want to register a subnet for τ1000.000000000? [y/n]:
2567>> Enter password to unlock key: [YOUR_PASSWORD]
2568>> ✅ Registered subnetwork with netuid: 1
2569```
2570
2571**NOTE**: The local chain will now have a default `netuid` of 1. The second registration will create a `netuid` 2 and so on, until you reach the subnet limit of 8. If you register more than 8 subnets, then a subnet with the least staked TAO will be replaced by the 9th subnet you register.

## 10. Register keys

Register your subnet validator and subnet miner on the subnet. This gives your two keys unique slots on the subnet. The subnet has a current limit of 128 slots.

Register the subnet miner:

```bash
btcli subnet register --wallet.name miner --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
```

Follow the below prompts:

```bash
>> Enter netuid [1] (1): 1
>> Continue Registration? [y/n]: y
>> ✅ Registered
```

Register the subnet validator:

```bash
btcli subnet register --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
```

Follow the below prompts:

```
>> Enter netuid [1] (1): 1
>> Continue Registration? [y/n]: y
>> ✅ Registered
```

## 11. Add stake

This step bootstraps the incentives on your new subnet by adding stake into its incentive mechanism.

```bash
btcli stake add --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
```

Follow the below prompts:

```bash
>> Stake all Tao from account: 'validator'? [y/n]: y
>> Stake:
    τ0.000000000 ➡ τ100.000000000
```

## 12. Validate key registrations

Verify that both the miner and validator keys are successfully registered:

```bash
btcli subnet list --subtensor.chain_endpoint ws://127.0.0.1:9946
```

You will see a `2` under the `NEURONS` column for `NETUID` 1, indicating that you have registered a validator and a miner in this subnet:

```bash
NETUID  NEURONS  MAX_N   DIFFICULTY  TEMPO  CON_REQ  EMISSION  BURN(τ)
  1        2     256.00  10.00 M     1000   None     0.00%     τ1.00000
  2      128
```

See the subnet validator's registered details:

```bash
btcli wallet overview --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946
```

You will see:

```
Subnet: 1
COLDKEY  HOTKEY  UID  ACTIVE  STAKE(τ)  RANK  TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)  VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
validator  default  0  True  100.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0  0.00000  14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1  1  2  τ100.00000  0.00000  0.00000  0.00000  0.00000  0.00000  ρ0  0.00000
        Wallet balance: τ0.0
```

See the subnet miner's registered details:

```bash
btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946
```

You will see:

```bash
Subnet: 1
COLDKEY  HOTKEY  UID  ACTIVE  STAKE(τ)  RANK  TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)  VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
miner  default  1  True  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0  0.00000  14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1  1  2  τ0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  ρ0  0.00000
        Wallet balance: τ0.0
```

## 13. Run subnet miner and subnet validator

Run the subnet miner and subnet validator. Make sure to specify your subnet parameters.

Run the subnet miner:

```bash
python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug
```

Run the subnet validator:

```bash
python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug
```

## 14. Set weights for your subnet

Register a validator on the root subnet and boost your subnet to set weights for it. This is a necessary step to ensure that the subnet is able to receive emissions.

### Register your validator on the root subnet

```bash
btcli root register --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
```

### Boost your subnet on the root subnet

```bash
btcli root boost --netuid 1 --increase 1 --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
```

## 15. Verify your incentive mechanism

After a few blocks, the subnet validator will set weights. This indicates that the incentive mechanism is active. Then, after a subnet tempo elapses (360 blocks, or 72 minutes), your incentive mechanism will begin distributing TAO to the subnet miner.
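
As a quick sanity check on the 72-minute figure (a sketch only; it assumes the chain's nominal 12-second block time, which this tutorial does not state explicitly):

```python
BLOCK_TIME_SECONDS = 12  # assumed block time of the chain
TEMPO_BLOCKS = 360       # subnet tempo quoted above

print(TEMPO_BLOCKS * BLOCK_TIME_SECONDS / 60)  # -> 72.0 minutes
```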

```bash
btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946
```

## Ending your session

To halt your nodes:

```bash
# Press CTRL + C in the terminal.
```

---



---
File: /docs/running_on_testnet.md
---

# Running Subnet on Testnet

This tutorial shows how to use the Bittensor testnet to create a subnet and run your incentive mechanism on it.

**IMPORTANT:** We strongly recommend that you first run [Running Subnet Locally](running_on_staging.md) before running on the testnet. Incentive mechanisms running on the testnet are open to anyone, and although these mechanisms do not emit real TAO, they cost you test TAO, which you must create.

**DANGER**
- Do not expose your private keys.
- Only use your testnet wallet.
- Do not reuse the password of your mainnet wallet.
- Make sure your incentive mechanism is resistant to abuse.

## Prerequisites

Before proceeding further, make sure that you have installed Bittensor. See the below instructions:

- [Install `bittensor`](https://github.com/opentensor/bittensor#install).

After installing `bittensor`, proceed as below:

## 1. Install Bittensor subnet template

**NOTE: Skip this step if** you already did this during local testing and development.

`cd` into your project directory and clone the bittensor-subnet-template repo:

```bash
git clone https://github.com/opentensor/bittensor-subnet-template.git
```

Next, `cd` into the bittensor-subnet-template repo directory:

```bash
cd bittensor-subnet-template # Enter the repo directory
```

Install the bittensor-subnet-template package:

```bash
python -m pip install -e .
```

## 2. Create wallets

Create wallets for the subnet owner, the subnet validator, and the subnet miner.

This step creates local coldkey and hotkey pairs for your three identities: subnet owner, subnet validator, and subnet miner.

The owner will create and control the subnet. The owner must have at least 100 testnet TAO before running the next steps.

The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.

Create a coldkey for your owner wallet:

```bash
btcli wallet new_coldkey --wallet.name owner
```

Create a coldkey and hotkey for your miner wallet:

```bash
btcli wallet new_coldkey --wallet.name miner
```

and

```bash
btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
```

Create a coldkey and hotkey for your validator wallet:

```bash
btcli wallet new_coldkey --wallet.name validator
```

and

```bash
btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
```

## 3. Get the price of subnet creation

Creating subnets on the testnet is competitive. The cost is determined by the rate at which new subnets are being registered onto the chain.

By default you must have at least 100 testnet TAO in your owner wallet to create a subnet. However, the exact amount will fluctuate based on demand. The below command shows how to get the current price of creating a subnet.

```bash
btcli subnet lock_cost --subtensor.network test
```

The above command will show:

```bash
>> Subnet lock cost: τ100.000000000
```

## 4. (Optional) Get faucet tokens

The faucet is disabled on the testnet. Hence, if you don't have sufficient faucet tokens, ask the [Bittensor Discord community](https://discord.com/channels/799672011265015819/830068283314929684) for faucet tokens.

## 5. Purchase a slot

Using the test TAO from the previous step, you can register your subnet on the testnet. This will create a new subnet on the testnet and give you owner permissions over it.

The below command shows how to purchase a slot.

**NOTE**: Slots cost TAO to lock. You will get this TAO back when the subnet is deregistered.

```bash
btcli subnet create --subtensor.network test
```

Enter the owner wallet name, which gives permissions to the coldkey:

```bash
>> Enter wallet name (default): owner # Enter your owner wallet name
>> Enter password to unlock key: # Enter your wallet password.
>> Register subnet? [y/n]: <y/n> # Select yes (y)
>> ⠇ 📡 Registering subnet...
✅ Registered subnetwork with netuid: 1 # Your subnet netuid will show here, save this for later.
```

## 6. Register keys

This step registers your subnet validator and subnet miner keys to the subnet, giving them the **first two slots** on the subnet.

Register your miner key to the subnet:

```bash
btcli subnet recycle_register --netuid 1 --subtensor.network test --wallet.name miner --wallet.hotkey default
```

Follow the below prompts:

```bash
>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
>> Continue Registration?
  hotkey:  ...
  coldkey: ...
  network: finney [y/n]: # Select yes (y)
>> ✅ Registered
```

Next, register your validator key to the subnet:

```bash
btcli subnet recycle_register --netuid 1 --subtensor.network test --wallet.name validator --wallet.hotkey default
```

Follow the prompts:

```bash
>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
>> Continue Registration?
  hotkey:  ...
  coldkey: ...
  network: finney [y/n]: # Select yes (y)
>> ✅ Registered
```

## 7. Check that your keys have been registered

This step returns information about your registered keys.

Check that your validator key has been registered:

```bash
btcli wallet overview --wallet.name validator --subtensor.network test
```

The above command will display the below:

```bash
Subnet: 1
COLDKEY  HOTKEY  UID  ACTIVE  STAKE(τ)  RANK  TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)  VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
validator  default  0  True  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0  0.00000  14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1  1  2  τ0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  ρ0  0.00000
        Wallet balance: τ0.0
```

Check that your miner has been registered:

```bash
btcli wallet overview --wallet.name miner --subtensor.network test
```

The above command will display the below:

```bash
Subnet: 1
COLDKEY  HOTKEY  UID  ACTIVE  STAKE(τ)  RANK  TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)  VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
miner  default  1  True  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0  0.00000  14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1  1  2  τ0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  ρ0  0.00000
        Wallet balance: τ0.0
```

## 8. Run subnet miner and subnet validator

Run the subnet miner:

```bash
python neurons/miner.py --netuid 1 --subtensor.network test --wallet.name miner --wallet.hotkey default --logging.debug
```

You will see the below terminal output:

```bash
>> 2023-08-08 16:58:11.223 | INFO | Running miner for subnet: 1 on network: ws://127.0.0.1:9946 with config: ...
```

Next, run the subnet validator:

```bash
python neurons/validator.py --netuid 1 --subtensor.network test --wallet.name validator --wallet.hotkey default --logging.debug
```

You will see the below terminal output:

```bash
>> 2023-08-08 16:58:11.223 | INFO | Running validator for subnet: 1 on network: ws://127.0.0.1:9946 with config: ...
```

## 9. Get emissions flowing

Register to the root network using `btcli`:

```bash
btcli root register --subtensor.network test
```

Then set your weights for the subnet:

```bash
btcli root weights --subtensor.network test
```

## 10. Stopping your nodes

To stop your nodes, press CTRL + C in the terminal where the nodes are running.



---
File: /docs/what_are_subnets.md
---

# What is Bittensor?

Bittensor is a network where computers validate the work that other computers contribute to the network; the work that is most valuable to the collective is rewarded.

Bittensor is a catalyst for open-source developers and smaller AI research labs, who now have a financial incentive for fine-tuning open foundational models.

Bittensor is a library of machine intelligence that continuously grows and shares knowledge amongst peers.

# What is a subnet?

Bittensor is releasing its own language for creating incentive mechanisms. This allows developers to build incentive systems on Bittensor, tapping into our web of intelligence to develop markets of the developer's choosing.

Subnet 1, an incentive system for machine intelligence production, showcases the enormous potential of markets to procure huge amounts of resources. Releasing user-created subnets is set to create a Cambrian explosion of additional resources into the Bittensor ecosystem.

# Why should you care?

As an open-source developer, you now have the ability to write your own incentive mechanisms without creating an entirely new chain. By tapping into Bittensor's network of intelligence, you can incentivize AI models from all over the world to perform tasks of your choosing (e.g., image generation, storage, compute access); the possibilities are truly endless.

The release of subnets also offers the potential to pull these tools into a shared network, making all the ingredients necessary to create intelligence available within one network, governed by one token.

You get to play a vital role in helping bootstrap what could one day become one of the most powerful networks in the world, and you make money by doing so!

By incentivizing developers to create their own markets, Bittensor is set to become a one-stop shop for those seeking all the compute requirements for building unstoppable applications on top of an incentivized infrastructure.

# Deeper dive

Check out the Bittensor about page [here](https://bittensor.com/about) for more details about what the Bittensor paradigm is and why subnets are revolutionary technology.

Also see our [linktree](https://linktr.ee/opentensor) for more information.


---
File: /neurons/__init__.py
---




---
File: /neurons/miner.py
---

# The MIT License (MIT)
# Copyright © 2023 Yuma Rao
# Copyright © 2023 Omega Labs, Inc.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import os
# Set USE_TORCH=1 environment variable to use torch instead of numpy
os.environ["USE_TORCH"] = "1"

import time
import typing
import bittensor as bt

# Bittensor Miner Template:
import omega

from omega.base.miner import BaseMinerNeuron
from omega.imagebind_wrapper import ImageBind
from omega.miner_utils import search_and_diarize_youtube_videos, search_and_embed_youtube_videos
from omega.augment import LocalLLMAugment, OpenAIAugment, NoAugment
from omega.utils.config import QueryAugment
from omega.constants import VALIDATOR_TIMEOUT, VALIDATOR_TIMEOUT_AUDIO
from omega.diarization_pipeline import CustomDiarizationPipeline


class Miner(BaseMinerNeuron):
    """
    Your miner neuron class. You should use this class to define your miner's behavior. In particular, you should replace the forward function with your own logic. You may also want to override the blacklist and priority functions according to your needs.
    """

    def __init__(self, config=None):
        super(Miner, self).__init__(config=config)
        query_augment_type = QueryAugment(self.config.neuron.query_augment)
        if query_augment_type == QueryAugment.NoAugment:
            self.augment = NoAugment(device=self.config.neuron.device)
        elif query_augment_type == QueryAugment.LocalLLMAugment:
            self.augment = LocalLLMAugment(device=self.config.neuron.device)
        elif query_augment_type == QueryAugment.OpenAIAugment:
            self.augment = OpenAIAugment(device=self.config.neuron.device)
        else:
            raise ValueError("Invalid query augment")

        self.diarization_pipeline = CustomDiarizationPipeline(
            overlap_detection_model_id="tezuesh/overlapped-speech-detection",
            diarization_model_id="tezuesh/diarization",
            # device="cuda"
        )
        self.imagebind = ImageBind(v2=True)

    async def forward_videos(
        self, synapse: omega.protocol.Videos
    ) -> omega.protocol.Videos:
        # Scrape YouTube videos
        bt.logging.info(f"Received scraping request: {synapse.num_videos} videos for query '{synapse.query}'")

        start = time.time()

        synapse.video_metadata = search_and_embed_youtube_videos(
            self.augment(synapse.query), synapse.num_videos, self.imagebind
        )

        time_elapsed = time.time() - start

        if len(synapse.video_metadata) == synapse.num_videos and time_elapsed < VALIDATOR_TIMEOUT:
            bt.logging.info(f"–––––– SCRAPING SUCCEEDED: Scraped {len(synapse.video_metadata)}/{synapse.num_videos} videos in {time_elapsed} seconds.")
        else:
            bt.logging.error(f"–––––– SCRAPING FAILED: Scraped {len(synapse.video_metadata)}/{synapse.num_videos} videos in {time_elapsed} seconds.")

        return synapse

    async def forward_audios(
        self, synapse: omega.protocol.Audios
    ) -> omega.protocol.Audios:
        bt.logging.info(f"Received YouTube audio scraping and diarization request: {synapse.num_audios} audios for query '{synapse.query}'")

        start = time.time()

        synapse.audio_metadata = search_and_diarize_youtube_videos(
            self.augment(synapse.query), synapse.num_audios, self.diarization_pipeline, self.imagebind
        )

        time_elapsed = time.time() - start

        if len(synapse.audio_metadata) == synapse.num_audios and time_elapsed < VALIDATOR_TIMEOUT_AUDIO:
            bt.logging.info(f"–––––– SCRAPING SUCCEEDED: Scraped {len(synapse.audio_metadata)}/{synapse.num_audios} audios in {time_elapsed} seconds.")
        else:
            bt.logging.error(f"–––––– SCRAPING FAILED: Scraped {len(synapse.audio_metadata)}/{synapse.num_audios} audios in {time_elapsed} seconds.")
        return synapse

    async def blacklist(
        self, synapse: bt.Synapse
    ) -> typing.Tuple[bool, str]:
        """
        Determines whether an incoming request should be blacklisted and thus ignored. Your implementation should
        define the logic for blacklisting requests based on your needs and desired security parameters.

        Blacklist runs before the synapse data has been deserialized (i.e. before synapse.data is available).
        The synapse is instead constructed via the headers of the request. It is important to blacklist
        requests before they are deserialized to avoid wasting resources on requests that will be ignored.

        Args:
            synapse (bt.Synapse): A synapse object constructed from the headers of the incoming request.

        Returns:
            Tuple[bool, str]: A tuple containing a boolean indicating whether the synapse's hotkey is blacklisted,
                              and a string providing the reason for the decision.

        This function is a security measure to prevent resource wastage on undesired requests. It should be enhanced
        to include checks against the metagraph for entity registration, validator status, and sufficient stake
        before deserialization of synapse data to minimize processing overhead.

        Example blacklist logic:
        - Reject if the hotkey is not a registered entity within the metagraph.
        - Consider blacklisting entities that are not validators or have insufficient stake.

        In practice it would be wise to blacklist requests from entities that are not validators, or do not have
        enough stake. This can be checked via metagraph.S and metagraph.validator_permit. You can always attain
        the uid of the sender via a metagraph.hotkeys.index( synapse.dendrite.hotkey ) call.

        Otherwise, allow the request to be processed further.
        """
        if not synapse.dendrite.hotkey:
            return True, "Hotkey not provided"
        registered = synapse.dendrite.hotkey in self.metagraph.hotkeys
        if self.config.blacklist.allow_non_registered and not registered:
            return False, "Allowing un-registered hotkey"
        elif not registered:
            bt.logging.trace(
                f"Blacklisting un-registered hotkey {synapse.dendrite.hotkey}"
            )
            return True, f"Unrecognized hotkey {synapse.dendrite.hotkey}"

        uid = self.metagraph.hotkeys.index(synapse.dendrite.hotkey)
        if self.config.blacklist.force_validator_permit:
            # If the config is set to force validator permit, then we should only allow requests from validators.
            if not self.metagraph.validator_permit[uid]:
                bt.logging.warning(
                    f"Blacklisting a request from non-validator hotkey {synapse.dendrite.hotkey}"
                )
                return True, "Non-validator hotkey"

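        # Optional stake floor: even permitted validators are rejected if their
        # stake is below the configured blacklist.validator_min_stake value.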
        stake = self.metagraph.S[uid].item()
        if self.config.blacklist.validator_min_stake and stake < self.config.blacklist.validator_min_stake:
            bt.logging.warning(f"Blacklisting request from {synapse.dendrite.hotkey} [uid={uid}], not enough stake -- {stake}")
            return True, "Stake below minimum"

        bt.logging.trace(
            f"Not blacklisting recognized hotkey {synapse.dendrite.hotkey}"
        )
        return False, "Hotkey recognized!"

    async def blacklist_videos(
        self, synapse: omega.protocol.Videos
    ) -> typing.Tuple[bool, str]:
        return await self.blacklist(synapse)

    async def blacklist_audios(
        self, synapse: omega.protocol.Audios
    ) -> typing.Tuple[bool, str]:
        return await self.blacklist(synapse)

    async def priority(self, synapse: bt.Synapse) -> float:
        """
        The priority function determines the order in which requests are handled. More valuable or higher-priority
        requests are processed before others. You should design your own priority mechanism with care.

        This implementation assigns priority to incoming requests based on the calling entity's stake in the metagraph.

        Args:
            synapse (bt.Synapse): The synapse object that contains metadata about the incoming request.

        Returns:
            float: A priority score derived from the stake of the calling entity.

        Miners may receive messages from multiple entities at once. This function determines which request should be
        processed first. Higher values indicate that the request should be processed first. Lower values indicate
        that the request should be processed later.

        Example priority logic:
        - A higher stake results in a higher priority value.
        """
        caller_uid = self.metagraph.hotkeys.index(
            synapse.dendrite.hotkey
        )  # Get the caller index.
        priority = float(
            self.metagraph.S[caller_uid]
        )  # Return the stake as the priority.
        bt.logging.trace(
            f"Prioritizing {synapse.dendrite.hotkey} with value: {priority}"
        )
        return priority

    async def priority_videos(
        self, synapse: omega.protocol.Videos
    ) -> float:
        return await self.priority(synapse)

    async def priority_audios(
        self, synapse: omega.protocol.Audios
    ) -> float:
        return await self.priority(synapse)

    def save_state(self):
        """
        We define this function to avoid printing out the log message in the BaseNeuron class
        that says `save_state() not implemented`.
        """
        pass

# This is the main function, which runs the miner.
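# Entering the `with` block starts the miner's axon server in the background;
# the loop below just keeps the main thread alive while requests are served.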
if __name__ == "__main__":
    with Miner() as miner:
        while True:
            bt.logging.info(f"Miner running... {time.time()}")
            time.sleep(5)



---
File: /neurons/test_miner.py
---

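# Standalone smoke test for the miner's search-and-embed path: it runs a single
# query end-to-end, compares the elapsed time and result count against the
# validator's expectations, and sends the resulting metadata to the dev
# validator API's count_unique endpoint.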
from omega.miner_utils import search_and_embed_youtube_videos, ImageBind
from omega.constants import VALIDATOR_TIMEOUT
from omega.protocol import Videos
import time
import requests

imagebind = ImageBind(v2=True)
start = time.time()
query = "wine and winemaking"
num_videos = 8
video_metadata_list = search_and_embed_youtube_videos(query, num_videos, imagebind)
time_elapsed = time.time() - start

if time_elapsed > VALIDATOR_TIMEOUT or len(video_metadata_list) < num_videos:
    if time_elapsed > VALIDATOR_TIMEOUT:
        print(f"Searching took {time_elapsed} seconds, which is longer than the validator timeout of {VALIDATOR_TIMEOUT} seconds")

    if len(video_metadata_list) < num_videos:
        print(f"Only got {len(video_metadata_list)} videos, which is less than the requested {num_videos} videos")
else:
    print(f"SUCCESS! Search and embed took {time_elapsed} seconds and got {len(video_metadata_list)} videos")


if len(video_metadata_list) == 0:
    print("No videos found")
else:
    videos = Videos(query=query, num_videos=num_videos, video_metadata=video_metadata_list)
    response = requests.get(
        "https://dev-validator.api.omega-labs.ai/api/count_unique",
        json=videos.to_serializable_dict(videos)
    )
    print(response.json())



---
File: /neurons/validator.py
---

# The MIT License (MIT)
# Copyright © 2023 Omega Labs, Inc.
# Copyright © 2023 Yuma Rao

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import os
# Set USE_TORCH=1 environment variable to use torch instead of numpy
os.environ["USE_TORCH"] = "1"

from aiohttp import ClientSession, BasicAuth
import asyncio
from typing import List, Tuple, Optional, BinaryIO, Dict
import datetime as dt
import random
import traceback
import requests
import math
import soundfile as sf
from io import BytesIO
import json
import numpy as np

# Bittensor
import bittensor as bt
import torch
import torch.nn.functional as F
from torch.nn import CosineSimilarity
import wandb
import base64

# Bittensor Validator Template:
from omega.utils.uids import get_random_uids
from omega.protocol import Videos, VideoMetadata, AudioMetadata, Audios
from omega.constants import (
    VALIDATOR_TIMEOUT,
    VALIDATOR_TIMEOUT_MARGIN,
    VALIDATOR_TIMEOUT_AUDIO,
    MAX_VIDEO_LENGTH,
    MIN_VIDEO_LENGTH,
    CHECK_PROBABILITY,
    DIFFERENCE_THRESHOLD,
    SIMILARITY_THRESHOLD,
    VIDEO_DOWNLOAD_TIMEOUT,
    MIN_SCORE,
    FAKE_VIDEO_PUNISHMENT,
    QUERY_RELEVANCE_SCALING_FACTOR,
    DESCRIPTION_RELEVANCE_SCALING_FACTOR,
    VIDEO_RELEVANCE_WEIGHT,
    FOCUS_REWARDS_PERCENT,
    AUDIO_REWARDS_PERCENT,
    DESCRIPTION_LENGTH_WEIGHT,
    MIN_LENGTH_BOOST_TOKEN_COUNT,
    MAX_LENGTH_BOOST_TOKEN_COUNT,
    STUFFED_DESCRIPTION_PUNISHMENT,
    FOCUS_MIN_SCORE,
    MIN_AUDIO_LENGTH_SECONDS,
    MAX_AUDIO_LENGTH_SECONDS,
    MIN_AUDIO_LENGTH_SCORE,
    SPEECH_CONTENT_SCALING_FACTOR,
    SPEAKER_DOMINANCE_SCALING_FACTOR,
    BACKGROUND_NOISE_SCALING_FACTOR,
    UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR,
    AUDIO_LENGTH_SCALING_FACTOR,
    AUDIO_QUALITY_SCALING_FACTOR,
    DIARIZATION_SCALING_FACTOR,
    AUDIO_QUERY_RELEVANCE_SCALING_FACTOR
)
from omega import video_utils, unstuff
from omega.imagebind_wrapper import ImageBind, Embeddings, run_async, LENGTH_TOKENIZER
from omega.text_similarity import get_text_similarity_score
from omega.diarization_metric import calculate_diarization_metrics
from omega.audio_scoring import AudioScore

# import base validator class which takes care of most of the boilerplate
from omega.base.validator import BaseValidatorNeuron

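# NO_RESPONSE_MINIMUM is the floor score assigned to miners that fail to respond.
# GPU_SEMAPHORE serializes the GPU-bound embedding checks (one at a time), while
# DOWNLOAD_SEMAPHORE allows at most five concurrent video downloads.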
NO_RESPONSE_MINIMUM = 0.005
GPU_SEMAPHORE = asyncio.Semaphore(1)
DOWNLOAD_SEMAPHORE = asyncio.Semaphore(5)

class Validator(BaseValidatorNeuron):
    """
    Your validator neuron class. You should use this class to define your validator's behavior. In particular, you should replace the forward function with your own logic.

    This class inherits from the BaseValidatorNeuron class, which in turn inherits from BaseNeuron. The BaseNeuron class takes care of routine tasks such as setting up wallet, subtensor, metagraph, logging directory, parsing config, etc. You can override any of the methods in BaseNeuron if you need to customize the behavior.

    This class provides reasonable default behavior for a validator such as keeping a moving average of the scores of the miners and using them to set weights at the end of each epoch. Additionally, the scores are reset for new hotkeys at the end of each epoch.
    """

    def __init__(self, config=None):
        super(Validator, self).__init__(config=config)
        self.audio_score = AudioScore()
        bt.logging.info("load_state()")
        self.load_state()
        self.successfully_started_wandb = False

        if not self.config.wandb.off:
            if os.getenv("WANDB_API_KEY"):
                self.new_wandb_run()
                self.successfully_started_wandb = True
            else:
                bt.logging.exception("WANDB_API_KEY not found. Set it with `export WANDB_API_KEY=<your API key>`. Alternatively, you can disable W&B with --wandb.off, but it is strongly recommended to run with W&B enabled.")
                self.successfully_started_wandb = False
        else:
            bt.logging.warning("Running with --wandb.off. It is strongly recommended to run with W&B enabled.")
            self.successfully_started_wandb = False

        api_root = (
            "https://dev-validator.api.omega-labs.ai"
            if self.config.subtensor.network == "test" else
            "https://validator.api.omega-labs.ai"
        )
        self.validation_endpoint = f"{api_root}/api/validate"
        self.proxy_endpoint = f"{api_root}/api/get_proxy"
        self.novelty_scores_endpoint = f"{api_root}/api/get_pinecone_novelty"
        self.upload_video_metadata_endpoint = f"{api_root}/api/upload_video_metadata"
        self.upload_audio_metadata_endpoint = f"{api_root}/api/upload_audio_metadata"
        self.focus_rewards_percent_endpoint = f"{api_root}/api/focus/get_rewards_percent"
        self.focus_miner_purchases_endpoint = f"{api_root}/api/focus/miner_purchase_scores"
        self.num_videos = 8
        self.num_audios = 4
        self.client_timeout_seconds = VALIDATOR_TIMEOUT + VALIDATOR_TIMEOUT_MARGIN
        self.client_timeout_seconds_audio = VALIDATOR_TIMEOUT_AUDIO + VALIDATOR_TIMEOUT_MARGIN

        # load topics from topics URL (CSV) or fallback to local topics file
        self.load_topics_start = dt.datetime.now()
        self.all_topics = self.load_topics()

        self.imagebind = None

        self.load_focus_rewards_start = dt.datetime.now()
        self.FOCUS_REWARDS_PERCENT = self.load_focus_rewards_percent()  # 2.5%
        self.AUDIO_REWARDS_PERCENT = AUDIO_REWARDS_PERCENT  # 12.5%
        self.YOUTUBE_REWARDS_PERCENT = 1.0 - self.FOCUS_REWARDS_PERCENT - self.AUDIO_REWARDS_PERCENT  # 85%
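        # The three reward pools always sum to 1.0: whatever is not allocated
        # to focus videos or audio goes to the YouTube video pool.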

        if not self.config.neuron.decentralization.off:
            if torch.cuda.is_available():
                bt.logging.info("Running with decentralization enabled, thank you Bittensor Validator!")
                self.decentralization = True
                self.imagebind = ImageBind(v2=True)
            else:
                bt.logging.warning("Attempting to run decentralization, but no GPU found. Please see min_compute.yml for minimum resource requirements.")
                self.decentralization = False
        else:
            bt.logging.warning("Running with --decentralization.off. It is strongly recommended to run with decentralization enabled.")
            self.decentralization = False

    def new_wandb_run(self):
        # Shoutout SN13 for the wandb snippet!
        """Creates a new wandb run to save information to."""
        # Create a unique run id for this run.
        now = dt.datetime.now()
        self.wandb_run_start = now
        run_id = now.strftime("%Y-%m-%d_%H-%M-%S")
        name = "validator-" + str(self.uid) + "-" + run_id
        self.wandb_run = wandb.init(
            name=name,
            project="omega-sn24-validator-logs",
            entity="omega-labs",
            config={
                "uid": self.uid,
                "hotkey": self.wallet.hotkey.ss58_address,
                "run_name": run_id,
                "type": "validator",
            },
            allow_val_change=True,
            anonymous="allow",
        )

        bt.logging.debug(f"Started a new wandb run: {name}")

    def load_topics(self):
        # get topics from CSV URL and load them into our topics list
        try:
            response = requests.get(self.config.topics_url)
            response.raise_for_status()
            # split the response text into a list of topics and trim any whitespace
            all_topics = [line.strip() for line in response.text.split("\n")]
            bt.logging.info(f"Loaded {len(all_topics)} topics from {self.config.topics_url}")
        except Exception as e:
            bt.logging.error(f"Error loading topics from URL {self.config.topics_url}: {e}")
            traceback.print_exc()
            bt.logging.info(f"Using fallback topics from {self.config.topics_path}")
            all_topics = [line.strip() for line in open(self.config.topics_path) if line.strip()]
            bt.logging.info(f"Loaded {len(all_topics)} topics from {self.config.topics_path}")
        return all_topics

    def load_focus_rewards_percent(self):
        # get focus rewards percent from API endpoint or fallback to default
        try:
            response = requests.get(self.focus_rewards_percent_endpoint)
            response.raise_for_status()
            rewards_percent = float(response.text)
            bt.logging.info(f"Loaded focus rewards percent of {rewards_percent} from {self.focus_rewards_percent_endpoint}")
        except Exception as e:
            bt.logging.error(f"Error loading focus rewards percent from {self.focus_rewards_percent_endpoint}: {e}")
            traceback.print_exc()
            bt.logging.info(f"Using fallback focus rewards percent of {FOCUS_REWARDS_PERCENT}")
            rewards_percent = FOCUS_REWARDS_PERCENT
        return rewards_percent

    async def forward(self):
        """
        Validator forward pass, called by the validator every time step. It is responsible for
        querying the network and scoring the responses, and consists of:
        - Generating the query
        - Querying the miners
        - Getting the responses
        - Rewarding the miners
        - Updating the scores
        """
        miner_uids = get_random_uids(self, k=self.config.neuron.sample_size)

        if len(miner_uids) == 0:
            bt.logging.info("No miners available")
            return

        """ START YOUTUBE AUDIO PROCESSING AND SCORING """
        bt.logging.info("===== YOUTUBE REQUESTS, AUDIO PROCESSING, AND SCORING =====")
        # The dendrite client queries the network.
        query = random.choice(self.all_topics) + " podcast"
        bt.logging.info(f"Sending query '{query}' to miners {miner_uids}")
        audio_input_synapse = Audios(query=query, num_audios=self.num_audios)
        bt.logging.info(f"audio_input_synapse: {audio_input_synapse}")
        axons = [self.metagraph.axons[uid] for uid in miner_uids]
        audio_responses = await self.dendrite(
            # Send the query to selected miner axons in the network.
            axons=axons,
            synapse=audio_input_synapse,
            deserialize=False,
            timeout=self.client_timeout_seconds_audio,
        )
        audio_working_miner_uids = []
        audio_finished_responses = []

        for response in audio_responses:
            if response.audio_metadata is None or not response.axon or not response.axon.hotkey:
                continue

            uid = [uid for uid, axon in zip(miner_uids, axons) if axon.hotkey == response.axon.hotkey][0]
            audio_working_miner_uids.append(uid)
            audio_finished_responses.append(response)

        if len(audio_working_miner_uids) == 0:
            bt.logging.info("No miner responses available for audio synapse")

        # Log the results for monitoring purposes.
        bt.logging.info(f"Received audio responses: {audio_responses}")

        # Adjust the scores based on responses from miners.
        try:
            audio_rewards_list = await self.handle_checks_and_reward_audio(input_synapse=audio_input_synapse, responses=audio_finished_responses)
        except Exception as e:
            bt.logging.error(f"Error in handle_checks_and_reward_audio: {e}")
            traceback.print_exc()
            return

        audio_rewards = []
        audio_reward_uids = []
        for r, r_uid in zip(audio_rewards_list, audio_working_miner_uids):
            if r is not None:
                audio_rewards.append(r)
                audio_reward_uids.append(r_uid)
        audio_rewards = torch.FloatTensor(audio_rewards).to(self.device)
        self.update_audio_scores(audio_rewards, audio_reward_uids)

        # give min reward to miners who didn't respond
        bad_miner_uids = [uid for uid in miner_uids if uid not in audio_working_miner_uids]
        penalty_tensor = torch.FloatTensor([NO_RESPONSE_MINIMUM] * len(bad_miner_uids)).to(self.device)
        self.update_audio_scores(penalty_tensor, bad_miner_uids)

        for reward, miner_uid in zip(audio_rewards, audio_reward_uids):
            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward} for audio dataset")

        for penalty, miner_uid in zip(penalty_tensor, bad_miner_uids):
            bt.logging.info(f"Penalizing miner={miner_uid} with penalty={penalty}")
        """ END YOUTUBE AUDIO PROCESSING AND SCORING """

        """ START YOUTUBE SYNAPSE REQUESTS, PROCESSING, AND SCORING """
        bt.logging.info("===== YOUTUBE REQUESTS, PROCESSING, AND SCORING =====")
        # The dendrite client queries the network.
        query = random.choice(self.all_topics)
        bt.logging.info(f"Sending query '{query}' to miners {miner_uids}")
        input_synapse = Videos(query=query, num_videos=self.num_videos)
        axons = [self.metagraph.axons[uid] for uid in miner_uids]
        responses = await self.dendrite(
            # Send the query to selected miner axons in the network.
            axons=axons,
            synapse=input_synapse,
            deserialize=False,
            timeout=self.client_timeout_seconds,
        )

        working_miner_uids = []
        finished_responses = []

        for response in responses:
            if response.video_metadata is None or not response.axon or not response.axon.hotkey:
                continue

            uid = [uid for uid, axon in zip(miner_uids, axons) if axon.hotkey == response.axon.hotkey][0]
            working_miner_uids.append(uid)
            finished_responses.append(response)

        if len(working_miner_uids) == 0:
            bt.logging.info("No miner responses available")

        # Log the results for monitoring purposes.
        bt.logging.info(f"Received video responses: {responses}")

        # Adjust the scores based on responses from miners.
        try:
            # Check if this validator is running decentralization
            if not self.decentralization:
                # if not, use validator API get_rewards system
                rewards_list = await self.get_rewards(input_synapse=input_synapse, responses=finished_responses)
            else:
                # if so, use decentralization logic with local GPU
                rewards_list = await self.handle_checks_and_rewards_youtube(input_synapse=input_synapse, responses=finished_responses)
        except Exception as e:
            bt.logging.error(f"Error in handle_checks_and_rewards_youtube: {e}")
            traceback.print_exc()
            return

        # give reward to all miners who responded and had a non-null reward
        rewards = []
        reward_uids = []
        for r, r_uid in zip(rewards_list, working_miner_uids):
            if r is not None:
                rewards.append(r)
                reward_uids.append(r_uid)
        rewards = torch.FloatTensor(rewards).to(self.device)
        self.update_scores(rewards, reward_uids)

        # give min reward to miners who didn't respond
        bad_miner_uids = [uid for uid in miner_uids if uid not in working_miner_uids]
        penalty_tensor = torch.FloatTensor([NO_RESPONSE_MINIMUM] * len(bad_miner_uids)).to(self.device)
        self.update_scores(penalty_tensor, bad_miner_uids)

        for reward, miner_uid in zip(rewards, reward_uids):
            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward}")

        for penalty, miner_uid in zip(penalty_tensor, bad_miner_uids):
            bt.logging.info(f"Penalizing miner={miner_uid} with penalty={penalty}")
        """ END YOUTUBE SYNAPSE REQUESTS, PROCESSING, AND SCORING """

        """ START FOCUS VIDEOS PROCESSING AND SCORING """
        bt.logging.info("===== FOCUS VIDEOS PROCESSING AND SCORING =====")
        # Gather all focus videos purchased by the subset of miners
        focus_miner_uids = []
        focus_miner_hotkeys = []

        # Get all the focus videos by iteratively calling the get_focus_videos() function.
        miner_hotkeys = []
        for miner_uid in miner_uids:
            miner_hotkeys.append(self.metagraph.hotkeys[miner_uid])
        focus_videos = await self.get_focus_videos(miner_hotkeys, miner_uids)

        # Check responses and mark which miner uids and hotkeys have focus videos
        for focus_video in focus_videos:
            if focus_video and 'purchased_videos' in focus_video:
                focus_miner_uids.append(focus_video['miner_uid'])
                focus_miner_hotkeys.append(focus_video['miner_hotkey'])

        if focus_videos is None or len(focus_miner_uids) == 0:
            bt.logging.info("No focus videos found for miners.")
            return

        focus_rewards_list = await self.handle_checks_and_rewards_focus(focus_videos=focus_videos)

        # give reward to all miners with focus videos and a non-null reward
        focus_rewards = []
        focus_reward_uids = []
        for r, r_uid in zip(focus_rewards_list, focus_miner_uids):
            if r is not None:
                focus_rewards.append(r)
                focus_reward_uids.append(r_uid)
        focus_rewards = torch.FloatTensor(focus_rewards).to(self.device)
        self.update_focus_scores(focus_rewards, focus_reward_uids)

        # set focus score to 0 for miners who don't have any focus videos
        no_focus_videos_miner_uids = [uid for uid in miner_uids if uid not in focus_reward_uids]
        no_rewards_tensor = torch.FloatTensor([FOCUS_MIN_SCORE] * len(no_focus_videos_miner_uids)).to(self.device)
        self.update_focus_scores(no_rewards_tensor, no_focus_videos_miner_uids)

        for reward, miner_uid in zip(focus_rewards, focus_reward_uids):
            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward} for focus videos")

        for no_reward, miner_uid in zip(no_rewards_tensor, no_focus_videos_miner_uids):
            bt.logging.info(f"Scoring miner={miner_uid} with reward={no_reward} for no focus videos")
        """ END FOCUS VIDEOS PROCESSING AND SCORING """

    def metadata_check(self, metadata: List[VideoMetadata]) -> List[VideoMetadata]:
        return [
            video_metadata for video_metadata in metadata
            if (
                video_metadata.end_time - video_metadata.start_time <= MAX_VIDEO_LENGTH and
                video_metadata.end_time - video_metadata.start_time >= MIN_VIDEO_LENGTH
            )
        ]

    def audio_metadata_check(self, metadata: List[AudioMetadata]) -> List[AudioMetadata]:
        return [
            audio_metadata for audio_metadata in metadata
            if (
                audio_metadata.end_time - audio_metadata.start_time <= MAX_VIDEO_LENGTH and
                audio_metadata.end_time - audio_metadata.start_time >= MIN_VIDEO_LENGTH
            )
        ]

    def filter_embeddings(self, embeddings: Embeddings, is_too_similar: List[bool]) -> Embeddings:
        """Filter the embeddings, keeping only those not flagged as too similar."""
        is_too_similar = torch.tensor(is_too_similar)
        if embeddings.video is not None:
            embeddings.video = embeddings.video[~is_too_similar]
        if embeddings.audio is not None:
            embeddings.audio = embeddings.audio[~is_too_similar]
        if embeddings.description is not None:
            embeddings.description = embeddings.description[~is_too_similar]
        return embeddings

    def filter_stuffed_embeddings(self, embeddings: Embeddings, stuffed: List[Tuple[bool, float]]) -> Embeddings:
        """Filter the embeddings, keeping only those whose descriptions were not flagged as stuffed."""
        stuffed = torch.tensor([s for s, _ in stuffed])
        if embeddings.video is not None:
            embeddings.video = embeddings.video[~stuffed]
        if embeddings.audio is not None:
            embeddings.audio = embeddings.audio[~stuffed]
        if embeddings.description is not None:
            embeddings.description = embeddings.description[~stuffed]
        return embeddings

    async def deduplicate_videos(self, embeddings: Embeddings) -> List[bool]:
        # return a list of booleans where True means the corresponding video is a duplicate, i.e. is_similar
        video_tensor = embeddings.video
        num_videos = video_tensor.shape[0]
        cossim = CosineSimilarity(dim=1)
        is_similar = []
        for i in range(num_videos):
            similarity_score = cossim(video_tensor[[i]], video_tensor[i + 1:])
            has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).any()
            is_similar.append(has_duplicates.item())

        return is_similar

    async def deduplicate_audios(self, embeddings: Embeddings) -> List[bool]:
        # return a list of booleans where True means the corresponding audio is a duplicate, i.e. is_similar
        audio_tensor = embeddings.audio
        num_audios = audio_tensor.shape[0]
        cossim = CosineSimilarity(dim=1)
        is_similar = []
        for i in range(num_audios):
            similarity_score = cossim(audio_tensor[[i]], audio_tensor[i + 1:])
            has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).any()
            is_similar.append(has_duplicates.item())

        return is_similar

    def is_similar(self, emb_1: torch.Tensor, emb_2: List[float]) -> bool:
        return F.cosine_similarity(
            emb_1,
            torch.tensor(emb_2, device=emb_1.device).unsqueeze(0)
        ) > SIMILARITY_THRESHOLD

    def strict_is_similar(self, emb_1: torch.Tensor, emb_2: List[float]) -> bool:
        return torch.allclose(emb_1, torch.tensor(emb_2, device=emb_1.device), atol=1e-4)

    async def get_random_youtube_video(
        self,
        metadata,
        check_video: bool
    ):
        if not check_video and len(metadata) > 0:
            random_metadata = random.choice(metadata)
            return random_metadata, None

        random_video = None
        metadata_copy = [v for v in metadata]  # list shallow copy
        while random_video is None and len(metadata_copy) > 0:
            idx = random.randint(0, len(metadata_copy) - 1)
            random_metadata = metadata_copy.pop(idx)

            proxy_url = await self.get_proxy_url()
            if proxy_url is None:
                bt.logging.info("Issue getting proxy_url from API, not using proxy. Attempting download for random_video check")
            else:
                bt.logging.info("Got proxy_url from API. Attempting download for random_video check")

            try:
                async with DOWNLOAD_SEMAPHORE:
                    random_video = await asyncio.wait_for(run_async(
                        video_utils.download_youtube_video,
                        random_metadata.video_id,
                        random_metadata.start_time,
                        random_metadata.end_time,
                        proxy=proxy_url
                    ), timeout=VIDEO_DOWNLOAD_TIMEOUT)
            except video_utils.IPBlockedException:
                # IP is blocked, cannot download video, check description only
                bt.logging.warning("WARNING: IP is blocked, cannot download video, checking description only")
                return random_metadata, None
            except video_utils.FakeVideoException:
                bt.logging.warning(f"WARNING: Video {random_metadata.video_id} is fake, punishing miner")
                return None
            except asyncio.TimeoutError:
                continue

        # IP is not blocked, video is not fake, but video download failed for some reason. We don't
        # know why it failed so we won't punish the miner, but we will check the description only.
        if random_video is None:
            return random_metadata, None

        return random_metadata, random_video

    async def random_youtube_check(self, random_meta_and_vid: Tuple[VideoMetadata, Optional[BinaryIO]]) -> bool:
        random_metadata, random_video = random_meta_and_vid

        if random_video is None:
            desc_embeddings = self.imagebind.embed_text([random_metadata.description])
            is_similar_ = self.is_similar(desc_embeddings, random_metadata.description_emb)
            strict_is_similar_ = self.strict_is_similar(desc_embeddings, random_metadata.description_emb)
            bt.logging.info(f"Description similarity: {is_similar_}, strict description similarity: {strict_is_similar_}")
            return is_similar_

        # Video downloaded, check all embeddings
        embeddings = self.imagebind.embed([random_metadata.description], [random_video])
        is_similar_ = (
            self.is_similar(embeddings.video, random_metadata.video_emb) and
            self.is_similar(embeddings.audio, random_metadata.audio_emb) and
            self.is_similar(embeddings.description, random_metadata.description_emb)
        )
        strict_is_similar_ = (
            self.strict_is_similar(embeddings.video, random_metadata.video_emb) and
            self.strict_is_similar(embeddings.audio, random_metadata.audio_emb) and
            self.strict_is_similar(embeddings.description, random_metadata.description_emb)
        )
        bt.logging.debug(f"Total similarity: {is_similar_}, strict total similarity: {strict_is_similar_}")
        return is_similar_

    async def random_audio_check(self, random_meta_and_audio: Tuple[AudioMetadata, Optional[BinaryIO]]) -> bool:
        random_metadata, random_video = random_meta_and_audio
        bt.logging.info(f"inside random_audio_check, random_metadata: {random_metadata}, random_video: {random_video}")
        if random_video is None:
            return True

        audio_bytes_from_youtube = video_utils.get_audio_bytes(random_video.name)
        audio_array_youtube, _ = sf.read(BytesIO(audio_bytes_from_youtube))
        submitted_audio_bytes = random_metadata.audio_bytes
        audio_array_submitted, _ = sf.read(BytesIO(base64.b64decode(submitted_audio_bytes)))

        if not np.array_equal(audio_array_youtube, audio_array_submitted):
            bt.logging.warning("WARNING: Audio bytes do not match")
            return False
        return True

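    # Batch novelty: each item's score is 1 minus its highest cosine similarity
    # to any later item in the batch, so the final item is always fully novel.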
    def compute_novelty_score_among_batch(self, emb: Embeddings) -> List[float]:
        video_tensor = emb.video
        num_videos = video_tensor.shape[0]
        novelty_scores = []
        for i in range(num_videos - 1):
            similarity_score = F.cosine_similarity(video_tensor[[i]], video_tensor[i + 1:]).max()
            novelty_scores.append(1 - similarity_score.item())
        novelty_scores.append(1.0)  # last video is 100% novel
        return novelty_scores

    def compute_novelty_score_among_batch_audio(self, emb: Embeddings) -> List[float]:
        audio_tensor = emb.audio
        num_audios = audio_tensor.shape[0]
        novelty_scores = []
        for i in range(num_audios - 1):
            similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
            novelty_scores.append(1 - similarity_score.item())
        novelty_scores.append(1.0)  # last audio is 100% novel
        return novelty_scores

    async def async_zero(self) -> int:
        return 0

    # algorithm for computing final novelty score
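    # The final novelty score sums the per-item novelty scores that clear
    # DIFFERENCE_THRESHOLD; items below the threshold contribute nothing.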
    def compute_final_novelty_score(self, base_novelty_scores: List[float]) -> float:
        is_too_similar = [score < DIFFERENCE_THRESHOLD for score in base_novelty_scores]
        novelty_score = sum([
            score for score, too_similar
            in zip(base_novelty_scores, is_too_similar) if not too_similar
        ])
        return novelty_score
3902
3903 async def check_videos_and_calculate_rewards_youtube(
3904 self,
3905 input_synapse: Videos,
3906 videos: Videos
3907 ) -> Optional[float]:
3908 try:
3909 # return minimum score if no videos were found in video_metadata
3910 if len(videos.video_metadata) == 0:
3911 return MIN_SCORE
3912
3913 # check video_ids for fake videos
3914 if any(not video_utils.is_valid_youtube_id(video.video_id) for video in videos.video_metadata):
3915 return FAKE_VIDEO_PUNISHMENT
3916
3917 # check and filter duplicate metadata
3918 metadata = self.metadata_check(videos.video_metadata)[:input_synapse.num_videos]
3919 if len(metadata) < len(videos.video_metadata):
3920 bt.logging.info(f"Filtered {len(videos.video_metadata)} videos down to {len(metadata)} videos")
3921
3922 # if randomly tripped, flag our random check to pull a video from miner's submissions
3923 check_video = CHECK_PROBABILITY > random.random()
3924
3925 # pull a random video and/or description only
3926 random_meta_and_vid = await self.get_random_youtube_video(metadata, check_video)
3927 if random_meta_and_vid is None:
3928 return FAKE_VIDEO_PUNISHMENT
3929
3930 # execute the random check on metadata and video
3931 async with GPU_SEMAPHORE:
3932 passed_check = await self.random_youtube_check(random_meta_and_vid)
3933
3934 # punish miner if not passing
3935 if not passed_check:
3936 return FAKE_VIDEO_PUNISHMENT
3937 query_emb = await self.imagebind.embed_text_async([videos.query])
3938
3939 embeddings = Embeddings(
3940 video=torch.stack([torch.tensor(v.video_emb) for v in metadata]).to(self.imagebind.device),
3941 audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]).to(self.imagebind.device),
3942 description=torch.stack([torch.tensor(v.description_emb) for v in metadata]).to(self.imagebind.device),
3943 )
3944
3945 # check and deduplicate videos based on embedding similarity checks. We do this because we're not uploading to pinecone first.
3946 metadata_is_similar = await self.deduplicate_videos(embeddings)
3947 metadata = [metadata for metadata, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
3948 embeddings = self.filter_embeddings(embeddings, metadata_is_similar)
3949 if len(metadata) < len(videos.video_metadata):
3950 bt.logging.info(f"Deduplicated {len(videos.video_metadata)} videos down to {len(metadata)} videos")
3951
3952 # return minimum score if no unique videos were found
3953 if len(metadata) == 0:
3954 return MIN_SCORE
3955
3956 # first get local novelty scores
3957 local_novelty_scores = self.compute_novelty_score_among_batch(embeddings)
3958 #bt.logging.debug(f"local_novelty_scores: {local_novelty_scores}")
3959 # second get the novelty scores from the validator api if not already too similar
3960 embeddings_to_check = [
3961 (embedding, metadata)
3962 for embedding, local_score, metadata in zip(embeddings.video, local_novelty_scores, metadata)
3963 if local_score >= DIFFERENCE_THRESHOLD
3964 ]
3965 # If there are embeddings to check, call get_novelty_scores once
3966 if embeddings_to_check:
3967 embeddings_to_check, metadata_to_check = zip(*embeddings_to_check)
3968 global_novelty_scores = await self.get_novelty_scores(metadata_to_check)
3969 else:
3970 # If no embeddings to check, return an empty list or appropriate default value
3971 global_novelty_scores = []
3972
3973 if global_novelty_scores is None or len(global_novelty_scores) == 0:
3974 bt.logging.error("Issue retrieving global novelty scores, returning None.")
3975 return None
3976            #bt.logging.debug(f"global_novelty_scores: {global_novelty_scores}")
3977
3978 # calculate true novelty scores between local and global
3979 true_novelty_scores = [
3980 min(local_score, global_score) for local_score, global_score
3981 in zip(local_novelty_scores, global_novelty_scores)
3982 ]
3983 #bt.logging.debug(f"true_novelty_scores: {true_novelty_scores}")
3984
3985 pre_filter_metadata_length = len(metadata)
3986 # check scores from index for being too similar
3987 is_too_similar = [score < DIFFERENCE_THRESHOLD for score in true_novelty_scores]
3988 # filter out metadata too similar
3989 metadata = [metadata for metadata, too_similar in zip(metadata, is_too_similar) if not too_similar]
3990 # filter out embeddings too similar
3991 embeddings = self.filter_embeddings(embeddings, is_too_similar)
3992 if len(metadata) < pre_filter_metadata_length:
3993 bt.logging.info(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos that are too similar to videos in our index.")
3994
3995 # return minimum score if no unique videos were found
3996 if len(metadata) == 0:
3997 return MIN_SCORE
3998
3999 # Filter out "stuffed" descriptions.
4000 pre_filter_metadata_length = len(metadata)
4001 stuffed = [
4002 unstuff.is_stuffed(meta.description)
4003 for meta in metadata
4004 ]
4005 if any([garbage and confidence > 0.75 for garbage, confidence in stuffed]):
4006 bt.logging.warning("Stuffed description found with high confidence, penalizing the miner.")
4007 return STUFFED_DESCRIPTION_PUNISHMENT
4008
4009 # More stuffing.
4010 extraneous = [
4011 unstuff.check_extraneous_chunks(meta.description, meta.video_emb, meta.audio_emb, self.imagebind)
4012 for meta in metadata
4013 ]
4014 for really_bad, low_quality, total in extraneous:
4015 if really_bad > 5 or low_quality >= 16:
4016 bt.logging.info(f"Extraneous garbage found in text check {really_bad=} {low_quality=} {total=}")
4017 return STUFFED_DESCRIPTION_PUNISHMENT
4018
4019 metadata = [
4020 metadata[idx]
4021 for idx in range(len(metadata))
4022 if not stuffed[idx][0]
4023 and extraneous[idx][1] <= 15
4024 and extraneous[idx][2] <= 50
4025 ]
4026 if len(metadata) < pre_filter_metadata_length:
4027 bt.logging.info(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos to remove token-stuffed descriptions.")
4028 if len(metadata) == 0:
4029 return MIN_SCORE
4030 embeddings = self.filter_stuffed_embeddings(embeddings, stuffed)
4031
4032 # Compute relevance scores
4033 video_description_relevance_scores = F.cosine_similarity(
4034 embeddings.video, embeddings.description
4035 ).tolist()
4036 audio_description_relevance_scores = F.cosine_similarity(
4037 embeddings.audio, embeddings.description
4038 ).tolist()
4039 video_query_relevance_scores = F.cosine_similarity(
4040 embeddings.video, query_emb
4041 ).tolist()
4042 audio_query_relevance_scores = F.cosine_similarity(
4043 embeddings.audio, query_emb
4044 ).tolist()
4045
4046 # Query relevance score now includes video cosim, audio cosim, and text cosim using higher quality text-only model.
4047 query_relevance_scores = [
4048 sum([
4049 video_query_relevance_scores[idx],
4050 audio_query_relevance_scores[idx],
4051 get_text_similarity_score(metadata[idx].description, videos.query),
4052 ]) / 3
4053 for idx in range(len(video_query_relevance_scores))
4054 ]
4055
4056 # Combine audio & visual description scores, weighted towards visual.
4057 description_relevance_scores = [
4058 sum([
4059 video_description_relevance_scores[idx] * VIDEO_RELEVANCE_WEIGHT,
4060 audio_description_relevance_scores[idx] * (1.0 - VIDEO_RELEVANCE_WEIGHT),
4061 ])
4062 for idx in range(len(video_description_relevance_scores))
4063 ]
4064
4065 # Scale description scores by number of unique tokens.
4066 length_scalers = []
4067 for idx in range(len(description_relevance_scores)):
4068 unique_tokens = LENGTH_TOKENIZER(metadata[idx].description)
4069 unique_tokens = set(unique_tokens[unique_tokens != 0][1:-1].tolist())
4070 unique_token_count = len(unique_tokens)
4071 if unique_token_count <= MIN_LENGTH_BOOST_TOKEN_COUNT:
4072 bt.logging.debug(f"Very few tokens, applying {DESCRIPTION_LENGTH_WEIGHT} penalty.")
4073 description_relevance_scores[idx] *= (1.0 - DESCRIPTION_LENGTH_WEIGHT)
4074 length_scalers.append(0)
4075 continue
4076 length_scaler = min(math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2), math.log(unique_token_count, 2)) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2)
4077 length_scaler /= (math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2))
4078 length_scalers.append(length_scaler)
4079 bt.logging.debug(f"Description length scaling factor = {length_scaler}")
4080 description_relevance_scores[idx] -= description_relevance_scores[idx] * DESCRIPTION_LENGTH_WEIGHT * (1.0 - length_scaler)
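            # Worked example (editorial; constants assumed for illustration only,
            # e.g. MIN_LENGTH_BOOST_TOKEN_COUNT=32, MAX_LENGTH_BOOST_TOKEN_COUNT=1024,
            # DESCRIPTION_LENGTH_WEIGHT=0.5): the scaler log-interpolates the unique
            # token count between the MIN and MAX boost bounds, so for 256 tokens
            #   length_scaler = (log2(256) - log2(32)) / (log2(1024) - log2(32))
            #                 = (8 - 5) / (10 - 5) = 0.6
            # and the description score keeps 1 - 0.5 * (1 - 0.6) = 80% of its value.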
4081
4082 # Aggregate scores
4083 score = (
4084 (sum(description_relevance_scores) * DESCRIPTION_RELEVANCE_SCALING_FACTOR) +
4085 (sum(query_relevance_scores) * QUERY_RELEVANCE_SCALING_FACTOR)
4086 ) / 2 / videos.num_videos
4087 score = max(score, MIN_SCORE)
4088
4089 # Log all our scores
4090 bt.logging.info(f'''
4091 is_unique: {[not is_sim for is_sim in is_too_similar]},
4092 video cosine sim: {video_description_relevance_scores},
4093 audio cosine sim: {audio_description_relevance_scores},
4094 description relevance scores: {description_relevance_scores},
4095 query relevance scores: {query_relevance_scores},
4096 length scalers: {length_scalers},
4097 total score: {score}
4098 ''')
4099
4100 # Upload our final results to API endpoint for index and dataset insertion. Include leaderboard statistics
4101 miner_hotkey = videos.axon.hotkey
4102 upload_result = await self.upload_video_metadata(metadata, description_relevance_scores, query_relevance_scores, videos.query, None, score, miner_hotkey)
4103 if upload_result:
4104 bt.logging.info("Uploading of video metadata successful.")
4105 else:
4106 bt.logging.error("Issue uploading video metadata.")
4107
4108 return score
4109
4110 except Exception as e:
4111 bt.logging.error(f"Error in check_videos_and_calculate_rewards_youtube: {e}")
4112 traceback.print_exc()
4113 return None
4114
4115 async def check_videos_and_calculate_rewards_focus(
4116 self,
4117 videos,
4118 ) -> Optional[float]:
4119 try:
4120 # return if no purchased videos were found
4121 if len(videos["purchased_videos"]) == 0:
4122 bt.logging.info("No focus videos found for miner.")
4123 return None
4124
4125 total_score = 0
4126 # Aggregate scores
4127 for video in videos["purchased_videos"]:
4128 bt.logging.debug(f"Focus video score for {video['video_id']}: {video['video_score']}")
4129
4130 # Set final score, giving minimum if necessary
4131 score = max(float(video["video_score"]), MIN_SCORE)
4132 total_score += score
4133
4134 return total_score
4135 except Exception as e:
4136 bt.logging.error(f"Error in check_videos_and_calculate_rewards_focus: {e}")
4137 traceback.print_exc()
4138 return None
4139
4140 # Get all the focus reward results by iteratively calling your check_videos_and_calculate_rewards_focus() function.
4141 async def handle_checks_and_rewards_focus(
4142 self, focus_videos
4143 ) -> torch.FloatTensor:
4144
4145 rewards = await asyncio.gather(*[
4146 self.check_videos_and_calculate_rewards_focus(
4147 focus_video
4148 )
4149 for focus_video in focus_videos
4150 ])
4151 return rewards
4152
4153 # Get all the reward results by iteratively calling your check_videos_and_calculate_rewards_youtube() function.
4154 async def handle_checks_and_rewards_youtube(
4155 self,
4156 input_synapse: Videos,
4157 responses: List[Videos],
4158 ) -> torch.FloatTensor:
4159
4160 rewards = await asyncio.gather(*[
4161 self.check_videos_and_calculate_rewards_youtube(
4162 input_synapse,
4163 response.replace_with_input(input_synapse), # replace with input properties from input_synapse
4164 )
4165 for response in responses
4166 ])
4167 return rewards
4168
4169 async def handle_checks_and_reward_audio(
4170 self,
4171 input_synapse: Audios,
4172 responses: List[Audios],
4173 ) -> torch.FloatTensor:
4174 rewards = await asyncio.gather(*[
4175 self.check_audios_and_calculate_rewards(
4176 input_synapse,
4177 response,
4178 )
4179 for response in responses
4180 ])
4181 return rewards
4182
4183 async def upload_video_metadata(
4184 self,
4185 metadata: List[VideoMetadata],
4186 description_relevance_scores: List[float],
4187 query_relevance_scores: List[float],
4188 query: str,
4189 novelty_score: float,
4190 score: float,
4191 miner_hotkey: str
4192 ) -> bool:
4193 """
4194 Queries the validator api to get novelty scores for supplied videos.
4195 Returns a list of float novelty scores for each video after deduplicating.
4196
4197 Returns:
4198 - List[float]: The novelty scores for the miner's videos.
4199 """
4200 keypair = self.dendrite.keypair
4201 hotkey = keypair.ss58_address
4202 signature = f"0x{keypair.sign(hotkey).hex()}"
4203 try:
4204 async with ClientSession() as session:
4205 # Serialize the list of VideoMetadata
4206 # serialized_metadata = [item.dict() for item in metadata]
4207 serialized_metadata = [json.loads(item.model_dump_json()) for item in metadata]
4208 # Construct the JSON payload
4209 payload = {
4210 "metadata": serialized_metadata,
4211 "description_relevance_scores": description_relevance_scores,
4212 "query_relevance_scores": query_relevance_scores,
4213 "topic_query": query,
4214 "novelty_score": novelty_score,
4215 "total_score": score,
4216 "miner_hotkey": miner_hotkey
4217 }
4218
4219 async with session.post(
4220 self.upload_video_metadata_endpoint,
4221 auth=BasicAuth(hotkey, signature),
4222 json=payload,
4223 ) as response:
4224 response.raise_for_status()
4225 result = await response.json()
4226 return True
4227 except Exception as e:
4228 bt.logging.debug(f"Error trying upload_video_metadata_endpoint: {e}")
4229 traceback.print_exc()
4230 return False
4231
4232
4233 async def upload_audio_metadata(
4234 self,
4235 metadata: List[AudioMetadata],
4236 inverse_der: float,
4237 audio_length_score: float,
4238 audio_quality_total_score: float,
4239 audio_query_score: float,
4240 query: str,
4241 total_score: float,
4242 miner_hotkey: str
4243 ) -> bool:
4244 """
4245 Queries the validator api to get novelty scores for supplied audios.
4246 Returns a list of float novelty scores for each audio after deduplicating.
4247
4248 Returns:
4249 - List[float]: The novelty scores for the miner's audios.
4250 """
4251 keypair = self.dendrite.keypair
4252 hotkey = keypair.ss58_address
4253 signature = f"0x{keypair.sign(hotkey).hex()}"
4254 try:
4255 async with ClientSession() as session:
4256 # Serialize the list of AudioMetadata
4257 # serialized_metadata = [item.dict() for item in metadata]
4258 serialized_metadata = [json.loads(item.model_dump_json()) for item in metadata]
4259 # Construct the JSON payload
4260 payload = {
4261 "metadata": serialized_metadata,
4262 "inverse_der": inverse_der,
4263 "audio_length_score": audio_length_score,
4264 "audio_quality_total_score": audio_quality_total_score,
4265 "audio_query_score": audio_query_score,
4266 "topic_query": query,
4267 "total_score": total_score,
4268 "miner_hotkey": miner_hotkey
4269 }
4270
4271 async with session.post(
4272 self.upload_audio_metadata_endpoint,
4273 auth=BasicAuth(hotkey, signature),
4274 json=payload,
4275 ) as response:
4276 response.raise_for_status()
4277 result = await response.json()
4278 return True
4279 except Exception as e:
4280 bt.logging.debug(f"Error trying upload_audio_metadata_endpoint: {e}")
4281 traceback.print_exc()
4282 return False
4283
4284 async def get_novelty_scores(self, metadata: List[VideoMetadata]) -> List[float]:
4285 """
4286 Queries the validator api to get novelty scores for supplied videos.
4287 Returns a list of float novelty scores for each video after deduplicating.
4288
4289 Returns:
4290 - List[float]: The novelty scores for the miner's videos.
4291 """
4292 keypair = self.dendrite.keypair
4293 hotkey = keypair.ss58_address
4294 signature = f"0x{keypair.sign(hotkey).hex()}"
4295 try:
4296 async with ClientSession() as session:
4297 # Serialize the list of VideoMetadata
4298 serialized_metadata = [item.dict() for item in metadata]
4299
4300 async with session.post(
4301 self.novelty_scores_endpoint,
4302 auth=BasicAuth(hotkey, signature),
4303 json=serialized_metadata,
4304 ) as response:
4305 response.raise_for_status()
4306 novelty_scores = await response.json()
4307 return novelty_scores
4308
4309 except Exception as e:
4310 bt.logging.debug(f"Error trying novelty_scores_endpoint: {e}")
4311 traceback.print_exc()
4312 return None
4313
4314 # async def get_novelty_scores_audio(self, metadata: List[AudioMetadata]) -> List[float]:
4315
4316 async def get_proxy_url(self) -> str:
4317 """
4318 Queries the validator api to get a random proxy URL.
4319
4320 Returns:
4321 - str: A proxy URL
4322 """
4323 keypair = self.dendrite.keypair
4324 hotkey = keypair.ss58_address
4325 signature = f"0x{keypair.sign(hotkey).hex()}"
4326 try:
4327 async with ClientSession() as session:
4328 async with session.post(
4329 self.proxy_endpoint,
4330 auth=BasicAuth(hotkey, signature),
4331 ) as response:
4332 response.raise_for_status()
4333 proxy_url = await response.json()
4334 return proxy_url
4335 except Exception as e:
4336 bt.logging.debug(f"Error trying proxy_endpoint: {e}")
4337 traceback.print_exc()
4338 return None
4339
4340
4341 async def check_audios_and_calculate_rewards(
4342 self,
4343 input_synapse: Audios,
4344 audios: Audios
4345 ) -> Optional[float]:
4346 try:
4347 # return minimum score if no videos were found in video_metadata
4348 if len(audios.audio_metadata) == 0:
4349 return MIN_SCORE
4350 # check video_ids for fake videos
4351 if any(not video_utils.is_valid_youtube_id(audio.video_id) for audio in audios.audio_metadata):
4352 return FAKE_VIDEO_PUNISHMENT
4353
4354 # check and filter duplicate metadata
4355 metadata = self.audio_metadata_check(audios.audio_metadata)[:input_synapse.num_audios]
4356 if len(metadata) < len(audios.audio_metadata):
4357 bt.logging.info(f"Filtered {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
4358
4359
4360 # if randomly tripped, flag our random check to pull a video from miner's submissions
4361 check_video = CHECK_PROBABILITY > random.random()
4362
4363
4364
4365 # pull a random video and/or description only
4366
4367 random_meta_and_vid = await self.get_random_youtube_video(metadata, check_video)
4368 if random_meta_and_vid is None:
4369 return FAKE_VIDEO_PUNISHMENT
4370
4371 # execute the random check on metadata and video
4372 async with GPU_SEMAPHORE:
4373                if check_video:
4374                    passed_check = await self.random_audio_check(random_meta_and_vid)
4375
4376                    # punish miner if not passing the random check
4377                    if not passed_check:
4378                        return FAKE_VIDEO_PUNISHMENT
4379 query_emb = await self.imagebind.embed_text_async([audios.query])
4380
4381 embeddings = Embeddings(
4382 video=None,
4383 audio=torch.stack([torch.tensor(a.audio_emb) for a in metadata]).to(self.imagebind.device),
4384 description=None
4385 )
4386
4387
4388 # check and deduplicate videos based on embedding similarity checks. We do this because we're not uploading to pinecone first.
4389 metadata_is_similar = await self.deduplicate_audios(embeddings)
4390 metadata = [metadata for metadata, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
4391 embeddings = self.filter_embeddings(embeddings, metadata_is_similar)
4392
4393 if len(metadata) < len(audios.audio_metadata):
4394 bt.logging.info(f"Deduplicated {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
4395
4396 # return minimum score if no unique videos were found
4397 if len(metadata) == 0:
4398 return MIN_SCORE
4399
4400 # first get local novelty scores
4401 local_novelty_scores = self.compute_novelty_score_among_batch_audio(embeddings)
4402
4403 pre_filter_metadata_length = len(metadata)
4404 # check scores from index for being too similar
4405 is_too_similar = [score < DIFFERENCE_THRESHOLD for score in local_novelty_scores]
4406 # filter out metadata too similar
4407 metadata = [metadata for metadata, too_similar in zip(metadata, is_too_similar) if not too_similar]
4408 # filter out embeddings too similar
4409 embeddings = self.filter_embeddings(embeddings, is_too_similar)
4410 if len(metadata) < pre_filter_metadata_length:
4411 bt.logging.info(f"Filtering {pre_filter_metadata_length} audios down to {len(metadata)} audios that are too similar to audios in our index.")
4412
4413 # return minimum score if no unique videos were found
4414 if len(metadata) == 0:
4415 return MIN_SCORE
4416
4417 # filter data based on audio length
4418 # Filter audios based on length constraints
4419 pre_filter_metadata_length = len(metadata)
4420 metadata = [
4421 meta for meta in metadata
4422 if (meta.end_time - meta.start_time) >= MIN_AUDIO_LENGTH_SECONDS
4423 and (meta.end_time - meta.start_time) <= MAX_AUDIO_LENGTH_SECONDS
4424 ]
4425
4426 if len(metadata) < pre_filter_metadata_length:
4427 bt.logging.info(f"Filtered {pre_filter_metadata_length} audios down to {len(metadata)} audios based on length constraints")
4428
4429 # Return minimum score if no audios remain after filtering
4430 if len(metadata) == 0:
4431 return MIN_SCORE
4432
4433 total_audio_length = sum((meta.end_time - meta.start_time) for meta in metadata)
4434 bt.logging.info(f"Average audio length: {total_audio_length/len(metadata):.2f} seconds")
4435 audio_length_score = total_audio_length/(self.num_audios*MAX_AUDIO_LENGTH_SECONDS)
4436
4437
4438 audio_query_score = sum(F.cosine_similarity(
4439 embeddings.audio, query_emb
4440 ).tolist())/len(metadata)
4441 bt.logging.info(f"Audio query score: {audio_query_score}")
4442
4443 # Randomly sample one audio for duration check
4444 selected_random_meta = random.choice(metadata)
4445 audio_array, sr = sf.read(BytesIO(base64.b64decode(selected_random_meta.audio_bytes)))
4446 audio_duration = len(audio_array) / sr
4447            bt.logging.info(f"Selected YouTube video: {selected_random_meta.video_id}, Duration: {audio_duration:.2f} seconds")
4448
4449 audio_quality_scores = self.audio_score.total_score(
4450 audio_array,
4451 sr,
4452 selected_random_meta.diar_timestamps_start,
4453 selected_random_meta.diar_timestamps_end,
4454 selected_random_meta.diar_speakers
4455 )
4456 audio_quality_total_score = (
4457 audio_quality_scores["speech_content_score"] * SPEECH_CONTENT_SCALING_FACTOR +
4458 audio_quality_scores["speaker_dominance_score"] * SPEAKER_DOMINANCE_SCALING_FACTOR +
4459 audio_quality_scores["background_noise_score"] * BACKGROUND_NOISE_SCALING_FACTOR +
4460 audio_quality_scores["unique_speakers_error"] * UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR
4461 )
4462 # query score
4463
4464 ## diarization segment
4465 miner_diar_segment = {
4466 "start": selected_random_meta.diar_timestamps_start,
4467 "end": selected_random_meta.diar_timestamps_end,
4468 "speakers": selected_random_meta.diar_speakers
4469 }
4470
4471 diarization_score = calculate_diarization_metrics(
4472 audio_array,
4473 sr,
4474 miner_diar_segment
4475 )
4476 inverse_der = diarization_score["inverse_der"]
4477 total_score = (
4478 DIARIZATION_SCALING_FACTOR * inverse_der +
4479 AUDIO_LENGTH_SCALING_FACTOR * audio_length_score +
4480 AUDIO_QUALITY_SCALING_FACTOR * audio_quality_total_score +
4481 AUDIO_QUERY_RELEVANCE_SCALING_FACTOR * audio_query_score
4482 )
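            # Worked example (editorial; scaling factors assumed for illustration
            # only, e.g. DIARIZATION=0.35, AUDIO_LENGTH=0.25, AUDIO_QUALITY=0.25,
            # AUDIO_QUERY_RELEVANCE=0.15): with inverse_der=0.8, length=0.5,
            # quality=0.7 and query=0.6 the composite is the plain weighted sum
            #   0.35*0.8 + 0.25*0.5 + 0.25*0.7 + 0.15*0.6 = 0.67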
4483
4484 bt.logging.info(
4485 f"total_score: {total_score}, "
4486 f"inverse_der: {inverse_der}, "
4487 f"audio_length_score: {audio_length_score}, "
4488 f"audio_quality_total_score: {audio_quality_total_score}, "
4489 f"audio_query_score: {audio_query_score}"
4490 )
4491 # Upload our final results to API endpoint for index and dataset insertion. Include leaderboard statistics
4492 miner_hotkey = audios.axon.hotkey
4493 bt.logging.info(f"Uploading audio metadata for miner: {miner_hotkey}")
4494 upload_result = await self.upload_audio_metadata(metadata, inverse_der, audio_length_score, audio_quality_total_score, audio_query_score, audios.query, total_score, miner_hotkey)
4495 if upload_result:
4496 bt.logging.info("Uploading of audio metadata successful.")
4497 else:
4498 bt.logging.error("Issue uploading audio metadata.")
4499 return total_score
4500
4501
4502 except Exception as e:
4503 bt.logging.error(f"Error in check_audios_and_calculate_rewards: {e}")
4504 traceback.print_exc()
4505 return None
4506
4507
4508 async def reward(self, input_synapse: Videos, response: Videos) -> float:
4509 """
4510 Reward the miner response to the query. This method returns a reward
4511 value for the miner, which is used to update the miner's score.
4512
4513 Returns:
4514 - float: The reward value for the miner.
4515 """
4516 keypair = self.dendrite.keypair
4517 hotkey = keypair.ss58_address
4518 signature = f"0x{keypair.sign(hotkey).hex()}"
4519
4520 try:
4521 async with ClientSession() as session:
4522 async with session.post(
4523 self.validation_endpoint,
4524 auth=BasicAuth(hotkey, signature),
4525 json=response.to_serializable_dict(input_synapse),
4526 ) as response:
4527 response.raise_for_status()
4528 score = await response.json()
4529 return score
4530 except Exception as e:
4531 bt.logging.debug(f"Error in reward: {e}")
4532 traceback.print_exc()
4533 return None
4534
4535 async def get_rewards(
4536 self,
4537 input_synapse: Videos,
4538 responses: List[Videos],
4539 ) -> torch.FloatTensor:
4540 """
4541 Returns a tensor of rewards for the given query and responses.
4542 """
4543 # Get all the reward results by iteratively calling your reward() function.
4544 rewards = await asyncio.gather(*[
4545 self.reward(
4546 input_synapse,
4547 response,
4548 )
4549 for response in responses
4550 ])
4551 return rewards
4552
4553 """
4554 {
4555 "5DaNytPVo6uFZFr2f9pZ6ck2gczNyYebLgrYZoFuccPS6qMi": {
4556 "purchased_videos": [{
4557 "video_id": "bcdb8247-2261-4268-af9c-1275101730d5",
4558 "task_id": "salman_test",
4559 "user_email": "[email protected]",
4560 "video_score": 0.408363,
4561 "video_details": {
4562 "description": "This is a random score, testing purposes only",
4563 "focusing_task": "focusing on nothing!",
4564 "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
4565 },
4566 "rejection_reason": null,
4567 "expected_reward_tao": 0.0816726,
4568 "earned_reward_tao": 0.0816726,
4569 "created_at": "2024-09-03T16:18:03",
4570 "updated_at": "2024-09-03T16:28:20",
4571 "deleted_at": null,
4572 "processing_state": "PURCHASED",
4573 "miner_uid": null,
4574 "miner_hotkey": "5DaNytPVo6uFZFr2f9pZ6ck2gczNyYebLgrYZoFuccPS6qMi"
4575 }
4576 ],
4577 "total_focus_points": 127.2251,
4578 "max_focus_points": 1000.0,
4579 "focus_points_percentage": 0.1272251
4580 }
4581 }
4582 """
4583 async def get_focus_videos(self, miner_hotkeys: List[str], miner_uids: List[int]) -> List[Dict]:
4584 bt.logging.debug(f"Making API call to get focus videos for {miner_hotkeys}")
4585 miner_hotkeys_str = ",".join(miner_hotkeys)
4586
4587 async with ClientSession() as session:
4588 try:
4589 async with session.get(f"{self.focus_miner_purchases_endpoint}/{miner_hotkeys_str}", timeout=10) as response:
4590 if response.status == 200:
4591 res_data = await response.json()
4592 if len(res_data) == 0:
4593 bt.logging.debug(f"-- No focus videos found for {miner_hotkeys}")
4594 return []
4595
4596 result = []
4597 for i, miner_hotkey in enumerate(miner_hotkeys):
4598 if miner_hotkey in res_data:
4599 miner_data = res_data[miner_hotkey]
4600 miner_data['miner_hotkey'] = miner_hotkey
4601 miner_data['miner_uid'] = miner_uids[i]
4602 result.append(miner_data)
4603 if len(miner_data["purchased_videos"]) == 0:
4604 bt.logging.debug(f"-- No focus videos found for {miner_hotkey}")
4605 else:
4606 bt.logging.debug(f"-- No data found for {miner_hotkey}")
4607
4608 return result
4609 else:
4610 error_message = await response.text()
4611 bt.logging.warning(f"Retrieving miner focus videos failed. Status: {response.status}, Message: {error_message}")
4612 return []
4613 except asyncio.TimeoutError:
4614 bt.logging.error("Request timed out in get_focus_videos")
4615 return []
4616 except Exception as e:
4617 bt.logging.error(f"Error in get_focus_videos: {e}")
4618 traceback.print_exc()
4619 return []
4620
4621# The main function parses the configuration and runs the validator.
4622if __name__ == "__main__":
4623 Validator().run()
4624
4625
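Every validator API call in the file above authenticates the same way: the validator's hotkey signs its own ss58 address, and that signature is sent as the HTTP basic-auth password. A minimal client sketch of the pattern (the endpoint URL and payload are placeholders, not values from the audited source):

import bittensor as bt
from aiohttp import ClientSession, BasicAuth

async def post_signed(keypair, url: str, payload: dict) -> dict:
    # The hotkey signs its own address; the API verifies the signature
    # against the claimed hotkey before accepting the request.
    hotkey = keypair.ss58_address
    signature = f"0x{keypair.sign(hotkey).hex()}"
    async with ClientSession() as session:
        async with session.post(
            url, auth=BasicAuth(hotkey, signature), json=payload
        ) as response:
            response.raise_for_status()
            return await response.json()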
4626---
4627File: /omega/api/examples/subnet21.py
4628---
4629
4630# The MIT License (MIT)
4631# Copyright © 2021 Yuma Rao
4632# Copyright © 2023 Opentensor Foundation
4633# Copyright © 2023 Opentensor Technologies Inc
4634
4635# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4636# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4637# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4638# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4639
4640# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4641# the Software.
4642
4643# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4644# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4645# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4646# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4647# DEALINGS IN THE SOFTWARE.
4648
4649import torch
4650import base64
4651import bittensor as bt
4652from abc import ABC, abstractmethod
4653from typing import Any, List, Union
4654from bittensor.subnets import SubnetsAPI
4655
4656try:
4657 from storage.validator.cid import generate_cid_string
4658 from storage.validator.encryption import (
4659 encrypt_data,
4660 decrypt_data_with_private_key,
4661 )
4662except ImportError:
4663 storage_url = "https://github.com/ifrit98/storage-subnet"
4664 bt.logging.error(
4665 f"Storage Subnet 21 not installed. Please visit: {storage_url} and install the package to use this example."
4666 )
4667
4668
4669class StoreUserAPI(SubnetsAPI):
4670 def __init__(self, wallet: "bt.wallet"):
4671 super().__init__(wallet)
4672 self.netuid = 21
4673
4674 def prepare_synapse(
4675 self,
4676 data: bytes,
4677 encrypt=False,
4678 ttl=60 * 60 * 24 * 30,
4679 encoding="utf-8",
4680 ) -> StoreUser:
4681 data = bytes(data, encoding) if isinstance(data, str) else data
4682 encrypted_data, encryption_payload = (
4683 encrypt_data(data, self.wallet) if encrypt else (data, "{}")
4684 )
4685 expected_cid = generate_cid_string(encrypted_data)
4686 encoded_data = base64.b64encode(encrypted_data)
4687
4688 synapse = StoreUser(
4689 encrypted_data=encoded_data,
4690 encryption_payload=encryption_payload,
4691 ttl=ttl,
4692 )
4693
4694 return synapse
4695
4696 def process_responses(
4697 self, responses: List[Union["bt.Synapse", Any]]
4698 ) -> str:
4699 success = False
4700 failure_modes = {"code": [], "message": []}
4701 for response in responses:
4702 if response.dendrite.status_code != 200:
4703 failure_modes["code"].append(response.dendrite.status_code)
4704 failure_modes["message"].append(
4705 response.dendrite.status_message
4706 )
4707 continue
4708
4709 stored_cid = (
4710 response.data_hash.decode("utf-8")
4711 if isinstance(response.data_hash, bytes)
4712 else response.data_hash
4713 )
4714 bt.logging.debug("received data CID: {}".format(stored_cid))
4715 success = True
4716 break
4717
4718 if success:
4719 bt.logging.info(
4720 f"Stored data on the Bittensor network with CID {stored_cid}"
4721 )
4722 else:
4723 bt.logging.error(
4724 f"Failed to store data. Response failure codes & messages {failure_modes}"
4725 )
4726 stored_cid = ""
4727
4728 return stored_cid
4729
4730
4731class RetrieveUserAPI(SubnetsAPI):
4732 def __init__(self, wallet: "bt.wallet"):
4733 super().__init__(wallet)
4734 self.netuid = 21
4735
4736 def prepare_synapse(self, cid: str) -> RetrieveUser:
4737 synapse = RetrieveUser(data_hash=cid)
4738 return synapse
4739
4740 def process_responses(
4741 self, responses: List[Union["bt.Synapse", Any]]
4742 ) -> bytes:
4743 success = False
4744 decrypted_data = b""
4745 for response in responses:
4746 bt.logging.trace(f"response: {response.dendrite.dict()}")
4747 if (
4748 response.dendrite.status_code != 200
4749 or response.encrypted_data is None
4750 ):
4751 continue
4752
4753 # Decrypt the response
4754 bt.logging.trace(
4755 f"encrypted_data: {response.encrypted_data[:100]}"
4756 )
4757 encrypted_data = base64.b64decode(response.encrypted_data)
4758 bt.logging.debug(
4759 f"encryption_payload: {response.encryption_payload}"
4760 )
4761 if (
4762 response.encryption_payload is None
4763 or response.encryption_payload == ""
4764 or response.encryption_payload == "{}"
4765 ):
4766 bt.logging.warning(
4767 "No encryption payload found. Unencrypted data."
4768 )
4769 decrypted_data = encrypted_data
4770 else:
4771 decrypted_data = decrypt_data_with_private_key(
4772 encrypted_data,
4773 response.encryption_payload,
4774 bytes(self.wallet.coldkey.private_key.hex(), "utf-8"),
4775 )
4776 bt.logging.trace(f"decrypted_data: {decrypted_data[:100]}")
4777 success = True
4778 break
4779
4780 if success:
4781 bt.logging.info(
4782 f"Returning retrieved data: {decrypted_data[:100]}"
4783 )
4784 else:
4785 bt.logging.error("Failed to retrieve data.")
4786
4787 return decrypted_data
4788
4789
4790async def test_store_and_retrieve(
4791 netuid: int = 22, wallet: "bt.wallet" = None
4792):
4793 # Example usage
4794 wallet = wallet or bt.wallet()
4795
4796 # Instantiate the handler
4797 store_handler = StoreUserAPI(wallet)
4798
4799 # Fetch the axons you want to query
4800    metagraph = bt.subtensor("test").metagraph(netuid=netuid)
4801 query_axons = metagraph.axons
4802
4803 cid = await store_handler(
4804 axons=query_axons,
4805 # any arguments for the proper synapse
4806 data=b"some data",
4807 encrypt=True,
4808 ttl=60 * 60 * 24 * 30,
4809 encoding="utf-8",
4810 uid=None,
4811 )
4812 print("CID:", cid)
4813
4814 retrieve_handler = RetrieveUserAPI(wallet)
4815 retrieve_response = await retrieve_handler(axons=query_axons, cid=cid)
4816
4817
4818
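The test routine above is a coroutine and never surfaces the retrieved bytes; a hedged driver for it (assuming a funded, locally configured wallet on the test network) might look like:

import asyncio
import bittensor as bt

if __name__ == "__main__":
    wallet = bt.wallet()  # assumes a locally configured test wallet
    asyncio.run(test_store_and_retrieve(netuid=22, wallet=wallet))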
4819---
4820File: /omega/api/__init__.py
4821---
4822
4823
4824
4825
4826---
4827File: /omega/api/dummy.py
4828---
4829
4830# The MIT License (MIT)
4831# Copyright © 2021 Yuma Rao
4832# Copyright © 2023 Opentensor Foundation
4833# Copyright © 2023 Opentensor Technologies Inc
4834
4835# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4836# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4837# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4838# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4839
4840# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4841# the Software.
4842
4843# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4844# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4845# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4846# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4847# DEALINGS IN THE SOFTWARE.
4848
4849import bittensor as bt
4850from typing import List, Optional, Union, Any, Dict
4851from omega.protocol import Dummy
4852from bittensor.subnets import SubnetsAPI
4853
4854
4855class DummyAPI(SubnetsAPI):
4856 def __init__(self, wallet: "bt.wallet"):
4857 super().__init__(wallet)
4858 self.netuid = 33
4859 self.name = "dummy"
4860
4861 def prepare_synapse(self, dummy_input: int) -> Dummy:
4862        synapse = Dummy(dummy_input=dummy_input)
4863 return synapse
4864
4865 def process_responses(
4866 self, responses: List[Union["bt.Synapse", Any]]
4867 ) -> List[int]:
4868 outputs = []
4869 for response in responses:
4870 if response.dendrite.status_code != 200:
4871 continue
4872            outputs.append(response.dummy_output)
4873 return outputs
4874
4875
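Like the storage example earlier in this report, a SubnetsAPI subclass is invoked by awaiting the instance itself, which routes through prepare_synapse and process_responses. A usage sketch under that assumption:

import bittensor as bt

async def query_dummy(wallet: "bt.wallet") -> list:
    dummy = DummyAPI(wallet)
    metagraph = bt.metagraph(netuid=33)  # netuid assumed to match DummyAPI
    # Await the handler with target axons plus prepare_synapse's arguments.
    return await dummy(axons=metagraph.axons, dummy_input=1)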
4876
4877---
4878File: /omega/api/get_query_axons.py
4879---
4880
4881# The MIT License (MIT)
4882# Copyright © 2021 Yuma Rao
4883# Copyright © 2023 Opentensor Foundation
4884# Copyright © 2023 Opentensor Technologies Inc
4885
4886# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4887# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4888# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4889# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4890
4891# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4892# the Software.
4893
4894# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4895# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4896# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4897# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4898# DEALINGS IN THE SOFTWARE.
4899
4900import torch
4901import random
4902import bittensor as bt
4903
4904
4905async def ping_uids(dendrite, metagraph, uids, timeout=3):
4906 """
4907 Pings a list of UIDs to check their availability on the Bittensor network.
4908
4909 Args:
4910 dendrite (bittensor.dendrite): The dendrite instance to use for pinging nodes.
4911 metagraph (bittensor.metagraph): The metagraph instance containing network information.
4912 uids (list): A list of UIDs (unique identifiers) to ping.
4913 timeout (int, optional): The timeout in seconds for each ping. Defaults to 3.
4914
4915 Returns:
4916 tuple: A tuple containing two lists:
4917 - The first list contains UIDs that were successfully pinged.
4918 - The second list contains UIDs that failed to respond.
4919 """
4920 axons = [metagraph.axons[uid] for uid in uids]
4921 try:
4922 responses = await dendrite(
4923 axons,
4924 bt.Synapse(), # TODO: potentially get the synapses available back?
4925 deserialize=False,
4926 timeout=timeout,
4927 )
4928 successful_uids = [
4929 uid
4930 for uid, response in zip(uids, responses)
4931 if response.dendrite.status_code == 200
4932 ]
4933 failed_uids = [
4934 uid
4935 for uid, response in zip(uids, responses)
4936 if response.dendrite.status_code != 200
4937 ]
4938 except Exception as e:
4939 bt.logging.error(f"Dendrite ping failed: {e}")
4940 successful_uids = []
4941 failed_uids = uids
4942 bt.logging.debug("ping() successful uids:", successful_uids)
4943 bt.logging.debug("ping() failed uids :", failed_uids)
4944 return successful_uids, failed_uids
4945
4946
4947async def get_query_api_nodes(dendrite, metagraph, n=0.1, timeout=3):
4948 """
4949 Fetches the available API nodes to query for the particular subnet.
4950
4951 Args:
4952        dendrite (bittensor.dendrite): The dendrite instance to use for querying nodes.
4953 metagraph (bittensor.metagraph): The metagraph instance containing network information.
4954 n (float, optional): The fraction of top nodes to consider based on stake. Defaults to 0.1.
4955 timeout (int, optional): The timeout in seconds for pinging nodes. Defaults to 3.
4956
4957 Returns:
4958 list: A list of UIDs representing the available API nodes.
4959 """
4960 bt.logging.debug(
4961 f"Fetching available API nodes for subnet {metagraph.netuid}"
4962 )
4963 vtrust_uids = [
4964 uid.item()
4965 for uid in metagraph.uids
4966 if metagraph.validator_trust[uid] > 0
4967 ]
4968 top_uids = torch.where(metagraph.S > torch.quantile(metagraph.S, 1 - n))
4969 top_uids = top_uids[0].tolist()
4970 init_query_uids = set(top_uids).intersection(set(vtrust_uids))
4971 query_uids, _ = await ping_uids(
4972 dendrite, metagraph, init_query_uids, timeout=timeout
4973 )
4974 bt.logging.debug(
4975 f"Available API node UIDs for subnet {metagraph.netuid}: {query_uids}"
4976 )
4977 if len(query_uids) > 3:
4978 query_uids = random.sample(query_uids, 3)
4979 return query_uids
4980
4981
4982async def get_query_api_axons(
4983 wallet, metagraph=None, n=0.1, timeout=3, uids=None
4984):
4985 """
4986 Retrieves the axons of query API nodes based on their availability and stake.
4987
4988 Args:
4989 wallet (bittensor.wallet): The wallet instance to use for querying nodes.
4990 metagraph (bittensor.metagraph, optional): The metagraph instance containing network information.
4991 n (float, optional): The fraction of top nodes to consider based on stake. Defaults to 0.1.
4992 timeout (int, optional): The timeout in seconds for pinging nodes. Defaults to 3.
4993 uids (Union[List[int], int], optional): The specific UID(s) of the API node(s) to query. Defaults to None.
4994
4995 Returns:
4996 list: A list of axon objects for the available API nodes.
4997 """
4998 dendrite = bt.dendrite(wallet=wallet)
4999
5000 if metagraph is None:
5001 metagraph = bt.metagraph(netuid=21)
5002
5003 if uids is not None:
5004 query_uids = [uids] if isinstance(uids, int) else uids
5005 else:
5006 query_uids = await get_query_api_nodes(
5007 dendrite, metagraph, n=n, timeout=timeout
5008 )
5009 return [metagraph.axons[uid] for uid in query_uids]
5010
5011
5012
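Putting the helpers above together, a caller would typically fetch the top-stake, validator-trusted, reachable axons and then query them; a minimal sketch:

import asyncio
import bittensor as bt

async def main():
    wallet = bt.wallet()
    metagraph = bt.metagraph(netuid=21)
    # Selects up to three reachable API nodes from the top 10% by stake.
    axons = await get_query_api_axons(wallet, metagraph=metagraph, n=0.1, timeout=3)
    print([axon.hotkey for axon in axons])

asyncio.run(main())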
5013---
5014File: /omega/base/__init__.py
5015---
5016
5017
5018
5019
5020---
5021File: /omega/base/miner.py
5022---
5023
5024# The MIT License (MIT)
5025# Copyright © 2023 Yuma Rao
5026
5027# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5028# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5029# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5030# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5031
5032# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5033# the Software.
5034
5035# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5036# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5037# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5038# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5039# DEALINGS IN THE SOFTWARE.
5040
5041import time
5042import asyncio
5043import threading
5044import argparse
5045import traceback
5046import datetime as dt
5047
5048import bittensor as bt
5049
5050from omega.base.neuron import BaseNeuron
5051from omega.utils.config import add_miner_args
5052
5053
5054class BaseMinerNeuron(BaseNeuron):
5055 """
5056 Base class for Bittensor miners.
5057 """
5058
5059 neuron_type: str = "MinerNeuron"
5060
5061 @classmethod
5062 def add_args(cls, parser: argparse.ArgumentParser):
5063 super().add_args(parser)
5064 add_miner_args(cls, parser)
5065
5066 def __init__(self, config=None):
5067 super().__init__(config=config)
5068
5069 # Warn if allowing incoming requests from anyone.
5070 if not self.config.blacklist.force_validator_permit:
5071 bt.logging.warning(
5072 "You are allowing non-validators to send requests to your miner. This is a security risk."
5073 )
5074 if self.config.blacklist.allow_non_registered:
5075 bt.logging.warning(
5076 "You are allowing non-registered entities to send requests to your miner. This is a security risk."
5077 )
5078
5079 # The axon handles request processing, allowing validators to send this miner requests.
5080 self.axon = bt.axon(wallet=self.wallet, config=self.config)
5081
5082        # Attach handlers that determine how incoming requests are serviced.
5083        bt.logging.info("Attaching forward functions to miner axon.")
5084 self.axon.attach(
5085 forward_fn=self.forward_videos,
5086 blacklist_fn=self.blacklist_videos,
5087 priority_fn=self.priority_videos,
5088 ).attach(
5089 forward_fn=self.forward_audios,
5090 blacklist_fn=self.blacklist_audios,
5091 priority_fn=self.priority_audios,
5092 )
5093 bt.logging.info(f"Axon created: {self.axon}")
5094
5095 # Instantiate runners
5096 self.should_exit: bool = False
5097 self.is_running: bool = False
5098 self.thread: threading.Thread = None
5099 self.lock = asyncio.Lock()
5100
5101 def run(self):
5102 """
5103 Initiates and manages the main loop for the miner on the Bittensor network. The main loop handles graceful shutdown on keyboard interrupts and logs unforeseen errors.
5104
5105 This function performs the following primary tasks:
5106 1. Check for registration on the Bittensor network.
5107 2. Starts the miner's axon, making it active on the network.
5108 3. Periodically resynchronizes with the chain; updating the metagraph with the latest network state and setting weights.
5109
5110 The miner continues its operations until `should_exit` is set to True or an external interruption occurs.
5111 During each epoch of its operation, the miner waits for new blocks on the Bittensor network, updates its
5112 knowledge of the network (metagraph), and sets its weights. This process ensures the miner remains active
5113 and up-to-date with the network's latest state.
5114
5115 Note:
5116 - The function leverages the global configurations set during the initialization of the miner.
5117 - The miner's axon serves as its interface to the Bittensor network, handling incoming and outgoing requests.
5118
5119 Raises:
5120 KeyboardInterrupt: If the miner is stopped by a manual interruption.
5121 Exception: For unforeseen errors during the miner's operation, which are logged for diagnosis.
5122 """
5123
5124 # Check that miner is registered on the network.
5125 self.sync()
5126
5127 # Serve passes the axon information to the network + netuid we are hosting on.
5128        # This will auto-update if the axon port or external ip has changed.
5129 bt.logging.info(
5130 f"Serving miner axon {self.axon} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
5131 )
5132 self.axon.serve(netuid=self.config.netuid, subtensor=self.subtensor)
5133
5134        # Start the miner's axon, making it active on the network.
5135 self.axon.start()
5136
5137 bt.logging.info(f"Miner starting at block: {self.block}")
5138
5139 # This loop maintains the miner's operations until intentionally stopped.
5140 try:
5141 while not self.should_exit:
5142 while (
5143 dt.datetime.now() - self.last_sync_check
5144 ).total_seconds() < self.sync_check_interval:
5145
5146 # Wait before checking again.
5147 time.sleep(1)
5148
5149 # Check if we should exit.
5150 if self.should_exit:
5151 break
5152
5153 # Sync metagraph and potentially set weights.
5154 self.sync()
5155 self.step += 1
5156
5157 # If someone intentionally stops the miner, it'll safely terminate operations.
5158 except KeyboardInterrupt:
5159 self.axon.stop()
5160 bt.logging.success("Miner killed by keyboard interrupt.")
5161 exit()
5162
5163 # In case of unforeseen errors, the miner will log the error and continue operations.
5164 except Exception as e:
5165 bt.logging.error(traceback.format_exc())
5166
5167 def run_in_background_thread(self):
5168 """
5169 Starts the miner's operations in a separate background thread.
5170 This is useful for non-blocking operations.
5171 """
5172 if not self.is_running:
5173 bt.logging.debug("Starting miner in background thread.")
5174 self.should_exit = False
5175 self.thread = threading.Thread(target=self.run, daemon=True)
5176 self.thread.start()
5177 self.is_running = True
5178 bt.logging.debug("Started")
5179
5180 def stop_run_thread(self):
5181 """
5182 Stops the miner's operations that are running in the background thread.
5183 """
5184 if self.is_running:
5185 bt.logging.debug("Stopping miner in background thread.")
5186 self.should_exit = True
5187 self.thread.join(5)
5188 self.is_running = False
5189 bt.logging.debug("Stopped")
5190
5191 def __enter__(self):
5192 """
5193 Starts the miner's operations in a background thread upon entering the context.
5194 This method facilitates the use of the miner in a 'with' statement.
5195 """
5196 self.run_in_background_thread()
5197 return self
5198
5199 def __exit__(self, exc_type, exc_value, traceback):
5200 """
5201 Stops the miner's background operations upon exiting the context.
5202 This method facilitates the use of the miner in a 'with' statement.
5203
5204 Args:
5205 exc_type: The type of the exception that caused the context to be exited.
5206 None if the context was exited without an exception.
5207 exc_value: The instance of the exception that caused the context to be exited.
5208 None if the context was exited without an exception.
5209 traceback: A traceback object encoding the stack trace.
5210 None if the context was exited without an exception.
5211 """
5212 self.stop_run_thread()
5213
5214 def resync_metagraph(self):
5215 """Resyncs the metagraph and updates the hotkeys and moving averages based on the new metagraph."""
5216 bt.logging.info("resync_metagraph(self)")
5217
5218 # Sync the metagraph.
5219 self.metagraph.sync(subtensor=self.subtensor)
5220
5221
5222
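Since BaseMinerNeuron implements __enter__ and __exit__, the intended entry point is a with-block that keeps the background thread alive. A sketch, assuming a concrete Miner subclass that defines the forward, blacklist, and priority handlers attached in __init__:

import time

if __name__ == "__main__":
    with Miner() as miner:  # Miner is an assumed BaseMinerNeuron subclass
        while not miner.should_exit:
            time.sleep(5)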
5223---
5224File: /omega/base/neuron.py
5225---
5226
5227# The MIT License (MIT)
5228# Copyright © 2023 Yuma Rao
5229
5230# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5231# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5232# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5233# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5234
5235# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5236# the Software.
5237
5238# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5239# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5240# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5241# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5242# DEALINGS IN THE SOFTWARE.
5243
5244import copy
5245import typing
5246import datetime as dt
5247
5248import bittensor as bt
5249
5250from abc import ABC, abstractmethod
5251
5252# Sync calls set weights and also resyncs the metagraph.
5253from omega.utils.config import check_config, add_args, config
5254from omega.utils.misc import ttl_get_block
5255from omega import __spec_version__ as spec_version
5256from omega.mock import MockSubtensor, MockMetagraph
5257
5258
5259class BaseNeuron(ABC):
5260 """
5261 Base class for Bittensor miners. This class is abstract and should be inherited by a subclass. It contains the core logic for all neurons; validators and miners.
5262
5263 In addition to creating a wallet, subtensor, and metagraph, this class also handles the synchronization of the network state via a basic checkpointing mechanism based on epoch length.
5264 """
5265
5266 neuron_type: str = "BaseNeuron"
5267
5268 @classmethod
5269 def check_config(cls, config: "bt.Config"):
5270 check_config(cls, config)
5271
5272 @classmethod
5273 def add_args(cls, parser):
5274 add_args(cls, parser)
5275
5276 @classmethod
5277 def config(cls):
5278 return config(cls)
5279
5280 subtensor: "bt.subtensor"
5281 wallet: "bt.wallet"
5282 metagraph: "bt.metagraph"
5283 spec_version: int = spec_version
5284
5285 @property
5286 def block(self):
5287 return ttl_get_block(self)
5288
5289 def __init__(self, config=None):
5290 base_config = copy.deepcopy(config or BaseNeuron.config())
5291 self.config = self.config()
5292 self.config.merge(base_config)
5293 self.check_config(self.config)
5294
5295 # Set up logging with the provided configuration.
5296 bt.logging.set_config(config=self.config.logging)
5297
5298 # If a gpu is required, set the device to cuda:N (e.g. cuda:0)
5299 self.device = self.config.neuron.device
5300
5301 # Log the configuration for reference.
5302 bt.logging.info(self.config)
5303
5304 # Build Bittensor objects
5305 # These are core Bittensor classes to interact with the network.
5306 bt.logging.info("Setting up bittensor objects.")
5307
5308 # The wallet holds the cryptographic key pairs for the miner.
5309 if self.config.mock:
5310 self.wallet = bt.MockWallet(config=self.config)
5311 self.subtensor = MockSubtensor(
5312 self.config.netuid, wallet=self.wallet
5313 )
5314 self.metagraph = MockMetagraph(
5315 self.config.netuid, subtensor=self.subtensor
5316 )
5317 else:
5318 self.wallet = bt.wallet(config=self.config)
5319 self.subtensor = bt.subtensor(config=self.config)
5320 self.metagraph = self.subtensor.metagraph(self.config.netuid)
5321
5322 bt.logging.info(f"Wallet: {self.wallet}")
5323 bt.logging.info(f"Subtensor: {self.subtensor}")
5324 bt.logging.info(f"Metagraph: {self.metagraph}")
5325
5326 # Check if the miner is registered on the Bittensor network before proceeding further.
5327 self.check_registered()
5328
5329 # Each miner gets a unique identity (UID) in the network for differentiation.
5330 self.uid = self.metagraph.hotkeys.index(
5331 self.wallet.hotkey.ss58_address
5332 )
5333 bt.logging.info(
5334 f"Running neuron on subnet: {self.config.netuid} with uid {self.uid} using network: {self.subtensor.chain_endpoint}"
5335 )
5336 self.step = 0
5337
5338 self.last_sync_check = dt.datetime.now()
5339 self.sync_check_interval = 300 # 5 minutes
5340
5341 # @abstractmethod
5342 # async def forward(self, synapse: bt.Synapse) -> bt.Synapse:
5343 # ...
5344
5345 @abstractmethod
5346 def run(self):
5347 ...
5348
5349 def sync(self):
5350 """
5351 Wrapper for synchronizing the state of the network for the given miner or validator.
5352 """
5353 # Ensure miner or validator hotkey is still registered on the network.
5354 try:
5355 self.check_registered()
5356 except Exception as e:
5357            bt.logging.error(f"Error checking registration status: {e}. Continuing in case it is a temporary subtensor connection issue.")
5358
5359 if self.should_sync_metagraph():
5360 self.resync_metagraph()
5361
5362 if self.should_set_weights():
5363 self.set_weights()
5364
5365 # Always save state.
5366 self.save_state()
5367
5368 # Update the last sync check time.
5369 self.last_sync_check = dt.datetime.now()
5370
5371 def check_registered(self):
5372 # --- Check for registration.
5373 if not self.subtensor.is_hotkey_registered(
5374 netuid=self.config.netuid,
5375 hotkey_ss58=self.wallet.hotkey.ss58_address,
5376 ):
5377 bt.logging.error(
5378 f"Wallet: {self.wallet} is not registered on netuid {self.config.netuid}."
5379 f" Please register the hotkey using `btcli subnets register` before trying again"
5380 )
5381 exit()
5382
5383 def should_sync_metagraph(self):
5384 """
5385 Check if enough epoch blocks have elapsed since the last checkpoint to sync.
5386 """
5387 return (
5388 self.block - self.metagraph.last_update[self.uid]
5389 ) > self.config.neuron.epoch_length
5390
5391 def should_set_weights(self) -> bool:
5392 # Don't set weights on initialization.
5393 if self.step == 0:
5394 return False
5395
5396 # Check if enough epoch blocks have elapsed since the last epoch.
5397 if self.config.neuron.disable_set_weights:
5398 return False
5399
5400        # Define appropriate logic for when to set weights.
5401 return (
5402 (self.block - self.metagraph.last_update[self.uid])
5403 > self.config.neuron.epoch_length
5404 and self.neuron_type != "MinerNeuron"
5405 )
5406
5407 def save_state(self):
5408 bt.logging.warning(
5409 "save_state() not implemented for this neuron. You can implement this function to save model checkpoints or other useful data."
5410 )
5411
5412 def load_state(self):
5413 bt.logging.warning(
5414 "load_state() not implemented for this neuron. You can implement this function to load model checkpoints or other useful data."
5415 )
5416
5417
5418
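The sync cadence above reduces to one comparison: both metagraph re-syncs and weight-setting fire once more than epoch_length blocks have passed since this UID's last update. A condensed sketch of that check:

def due(block: int, last_update: int, epoch_length: int) -> bool:
    # Mirrors the should_sync_metagraph / should_set_weights block arithmetic.
    return (block - last_update) > epoch_length

assert due(block=1100, last_update=1000, epoch_length=99)      # update due
assert not due(block=1050, last_update=1000, epoch_length=99)  # still fresh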
5419---
5420File: /omega/base/validator.py
5421---
5422
5423# The MIT License (MIT)
5424# Copyright © 2023 Yuma Rao
5425# Copyright © 2023 Omega Labs, Inc.
5426
5427# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5428# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5429# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5430# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5431
5432# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5433# the Software.
5434
5435# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5436# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5437# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5438# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5439# DEALINGS IN THE SOFTWARE.
5440
5441
5442import copy
5443import torch
5444import asyncio
5445import argparse
5446import os
5447import threading
5448import datetime as dt
5449import bittensor as bt
5450from datetime import datetime
5451from subprocess import Popen, PIPE
5452
5453from typing import List
5454from traceback import print_exception
5455
5456from omega.base.neuron import BaseNeuron
5457from omega.mock import MockDendrite
5458from omega.utils.config import add_validator_args
5459from omega.constants import FOCUS_REWARDS_PERCENT, AUDIO_REWARDS_PERCENT
5460
5461
5462class BaseValidatorNeuron(BaseNeuron):
5463 """
5464 Base class for Bittensor validators. Your validator should inherit from this class.
5465 """
5466
5467 neuron_type: str = "ValidatorNeuron"
5468
5469 @classmethod
5470 def add_args(cls, parser: argparse.ArgumentParser):
5471 super().add_args(parser)
5472 add_validator_args(cls, parser)
5473
5474 def __init__(self, config=None):
5475 super().__init__(config=config)
5476
5477 # Save a copy of the hotkeys to local memory.
5478 self.hotkeys = copy.deepcopy(self.metagraph.hotkeys)
5479
5480 # Dendrite lets us send messages to other nodes (axons) in the network.
5481 if self.config.mock:
5482 self.dendrite = MockDendrite(wallet=self.wallet)
5483 else:
5484 self.dendrite = bt.dendrite(wallet=self.wallet)
5485 bt.logging.info(f"Dendrite: {self.dendrite}")
5486
5487 # Set up initial scoring weights for validation
5488 bt.logging.info("Building validation weights.")
5489 self.scores = torch.zeros(
5490 self.metagraph.n, dtype=torch.float32, device=self.device
5491 )
5492 self.focus_scores = torch.zeros(
5493 self.metagraph.n, dtype=torch.float32, device=self.device
5494 )
5495
5496 self.audio_score_arr = torch.zeros(
5497 self.metagraph.n, dtype=torch.float32, device=self.device
5498 )
5499
5500 # Serve axon to enable external connections.
5501 if not self.config.neuron.axon_off:
5502 self.serve_axon()
5503 else:
5504 bt.logging.warning("axon off, not serving ip to chain.")
5505
5506 if self.config.neuron.auto_update:
5507 bt.logging.info("Auto update enabled.")
5508 else:
5509 bt.logging.info("Auto update disabled.")
5510
5511 # Create asyncio event loop to manage async tasks.
5512 self.loop = asyncio.get_event_loop()
5513
5514 # Instantiate runners
5515 self.should_exit: bool = False
5516 self.is_running: bool = False
5517 self.thread: threading.Thread = None
5518 self.lock = asyncio.Lock()
5519 self.last_update_check = datetime.now()
5520 self.update_check_interval = 1800 # 30 minutes
5521
5522 def serve_axon(self):
5523 """Serve axon to enable external connections."""
5524
5525 bt.logging.info("serving ip to chain...")
5526 try:
5527 self.axon = bt.axon(wallet=self.wallet, config=self.config)
5528
5529 try:
5530 self.subtensor.serve_axon(
5531 netuid=self.config.netuid,
5532 axon=self.axon,
5533 )
5534 bt.logging.info(
5535 f"Running validator {self.axon} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
5536 )
5537 except Exception as e:
5538 bt.logging.error(f"Failed to serve Axon with exception: {e}")
5539 pass
5540
5541 except Exception as e:
5542 bt.logging.error(
5543 f"Failed to create and initialize Axon with exception: {e}"
5544 )
5545 pass
5546
5547 async def concurrent_forward(self):
5548 coroutines = [
5549 self.forward()
5550 for _ in range(self.config.neuron.num_concurrent_forwards)
5551 ]
5552 await asyncio.gather(*coroutines)
5553
5554 def is_git_latest(self) -> bool:
5555 p = Popen(['git', 'rev-parse', 'HEAD'], stdout=PIPE, stderr=PIPE)
5556 out, err = p.communicate()
5557 if err:
5558 return False
5559 current_commit = out.decode().strip()
5560 p = Popen(['git', 'ls-remote', 'origin', 'HEAD'], stdout=PIPE, stderr=PIPE)
5561 out, err = p.communicate()
5562 if err:
5563 return False
5564 latest_commit = out.decode().split()[0]
5565 bt.logging.info(f'Current commit: {current_commit}, Latest commit: {latest_commit}')
5566 return current_commit == latest_commit
5567
5568 def should_restart(self) -> bool:
5569 # Check if enough time has elapsed since the last update check, if not assume we are up to date.
5570 if (datetime.now() - self.last_update_check).total_seconds() < self.update_check_interval:
5571 return False
5572
5573 self.last_update_check = datetime.now()
5574
5575 return not self.is_git_latest()
5576
5577 def run(self):
5578 """
5579 Initiates and manages the main loop for the validator on the Bittensor network. The main loop handles graceful shutdown on keyboard interrupts and logs unforeseen errors.
5580
5581 This function performs the following primary tasks:
5582 1. Check for registration on the Bittensor network.
5583 2. Continuously forwards queries to the miners on the network, rewarding their responses and updating the scores accordingly.
5584 3. Periodically resynchronizes with the chain; updating the metagraph with the latest network state and setting weights.
5585
5586 The essence of the validator's operations is in the forward function, which is called every step. The forward function is responsible for querying the network and scoring the responses.
5587
5588 Note:
5589 - The function leverages the global configurations set during the initialization of the validator.
5590 - The validator's axon serves as its interface to the Bittensor network, handling incoming and outgoing requests.
5591
5592 Raises:
5593 KeyboardInterrupt: If the validator is stopped by a manual interruption.
5594 Exception: For unforeseen errors during the validator's operation, which are logged for diagnosis.
5595 """
5596
5597 # Check that validator is registered on the network.
5598 self.sync()
5599
5600 bt.logging.info(f"Validator starting at block: {self.block}")
5601
5602 # This loop maintains the validator's operations until intentionally stopped.
5603 try:
5604 while True:
5605 bt.logging.info(f"step({self.step}) block({self.block})")
5606
5607 # Run multiple forwards concurrently.
5608 self.loop.run_until_complete(self.concurrent_forward())
5609
5610 # Check if we should exit.
5611 if self.should_exit:
5612 break
5613
5614 if self.config.neuron.auto_update and self.should_restart():
5615 bt.logging.info(f'Validator is out of date, quitting to restart.')
5616 raise KeyboardInterrupt
5617
5618 # Sync metagraph and potentially set weights.
5619 self.sync()
5620
5621 self.step += 1
5622
5623 # Check if we should start a new wandb run.
5624 if not self.config.wandb.off and self.successfully_started_wandb:
5625 if (dt.datetime.now() - self.wandb_run_start) >= dt.timedelta(
5626 hours=6
5627 ):
5628 bt.logging.info(
5629 "Current wandb run is more than 6 hours old. Starting a new run."
5630 )
5631 self.wandb_run.finish()
5632 self.new_wandb_run()
5633
5634 # Check if we should reload the topics.
5635 if (dt.datetime.now() - self.load_topics_start) >= dt.timedelta(
5636 hours=1
5637 ):
5638 bt.logging.info("Reloading topics after 1 hour.")
5639 self.all_topics = self.load_topics()
5640 self.load_topics_start = dt.datetime.now()
5641
5642 # Check if we should reload the focus videos rewards percentage.
5643 if (dt.datetime.now() - self.load_focus_rewards_start) >= dt.timedelta(
5644 hours=1
5645 ):
5646 bt.logging.info("Reloading focus videos rewards percent after 1 hour.")
5647 self.FOCUS_REWARDS_PERCENT = self.load_focus_rewards_percent()
5648 self.AUDIO_REWARDS_PERCENT = AUDIO_REWARDS_PERCENT
5649 self.YOUTUBE_REWARDS_PERCENT = 1.0 - self.FOCUS_REWARDS_PERCENT - self.AUDIO_REWARDS_PERCENT
5650 self.load_focus_rewards_start = dt.datetime.now()
5651
5652 # If someone intentionally stops the validator, it'll safely terminate operations.
5653 except KeyboardInterrupt:
5654 self.axon.stop()
5655 bt.logging.success("Validator killed by keyboard interrupt.")
5656 exit()
5657
5658 # In case of unforeseen errors, the validator will log the error and continue operations.
5659 except Exception as err:
5660 bt.logging.error("Error during validation", str(err))
5661 bt.logging.debug(
5662 print_exception(type(err), err, err.__traceback__)
5663 )
5664
5665 def run_in_background_thread(self):
5666 """
5667 Starts the validator's operations in a background thread upon entering the context.
5668 This method facilitates the use of the validator in a 'with' statement.
5669 """
5670 if not self.is_running:
5671 bt.logging.debug("Starting validator in background thread.")
5672 self.should_exit = False
5673 self.thread = threading.Thread(target=self.run, daemon=True)
5674 self.thread.start()
5675 self.is_running = True
5676 bt.logging.debug("Started")
5677
5678 def stop_run_thread(self):
5679 """
5680 Stops the validator's operations that are running in the background thread.
5681 """
5682 if self.is_running:
5683 bt.logging.debug("Stopping validator in background thread.")
5684 self.should_exit = True
5685 self.thread.join(5)
5686 self.is_running = False
5687 bt.logging.debug("Stopped")
5688
5689 def __enter__(self):
5690 self.run_in_background_thread()
5691 return self
5692
5693 def __exit__(self, exc_type, exc_value, traceback):
5694 """
5695 Stops the validator's background operations upon exiting the context.
5696 This method facilitates the use of the validator in a 'with' statement.
5697
5698 Args:
5699 exc_type: The type of the exception that caused the context to be exited.
5700 None if the context was exited without an exception.
5701 exc_value: The instance of the exception that caused the context to be exited.
5702 None if the context was exited without an exception.
5703 traceback: A traceback object encoding the stack trace.
5704 None if the context was exited without an exception.
5705 """
5706 if self.is_running:
5707 bt.logging.debug("Stopping validator in background thread.")
5708 self.should_exit = True
5709 self.thread.join(5)
5710 self.is_running = False
5711 bt.logging.debug("Stopped")
5712
5713 def pad_tensors(self, tensor_a, tensor_b, tensor_c):
5714 # Ensure all three tensors are on the same device
5715 device = tensor_a.device
5716 tensor_b = tensor_b.to(device)
5717 tensor_c = tensor_c.to(device)
5718 max_size = max(tensor_a.size(0), tensor_b.size(0), tensor_c.size(0))
5719 if tensor_a.size(0) < max_size:
5720 padding = torch.zeros(max_size - tensor_a.size(0), device=device)
5721 tensor_a = torch.cat((tensor_a, padding))
5722 print("tensor a was padded")
5723 if tensor_b.size(0) < max_size:
5724 padding = torch.zeros(max_size - tensor_b.size(0), device=device)
5725 tensor_b = torch.cat((tensor_b, padding))
5726 print("tensor b was padded")
5727 if tensor_c.size(0) < max_size:
5728 padding = torch.zeros(max_size - tensor_c.size(0), device=device)
5729 tensor_c = torch.cat((tensor_c, padding))
5730 print("tensor c was padded")
5731
5732 return tensor_a, tensor_b, tensor_c
5733
5734 def set_weights(self):
5735 """
5736 Sets the validator weights to the metagraph hotkeys based on the scores it has received from the miners. The weights determine the trust and incentive level the validator assigns to miner nodes on the network.
5737 """
5738
5739 # Check if self.scores contains any NaN values and log a warning if it does.
5740 if torch.isnan(self.scores).any():
5741 bt.logging.warning(
5742 f"Scores contain NaN values. This may be due to a lack of responses from miners, or a bug in your reward functions."
5743 )
5744
5745 self.scores, self.focus_scores, self.audio_score_arr = self.pad_tensors(self.scores, self.focus_scores, self.audio_score_arr)
5746
5747 bt.logging.debug(f"Normalizing scores with YOUTUBE_REWARDS_PERCENT: {self.YOUTUBE_REWARDS_PERCENT}, FOCUS_REWARDS_PERCENT: {self.FOCUS_REWARDS_PERCENT}, AUDIO_REWARDS_PERCENT: {self.AUDIO_REWARDS_PERCENT}")
5748 # L1-normalize each score array so every component sums to its configured
5749 # rewards percentage before the three are combined below.
5750 # Normalize the youtube rewards and scale by the percentage.
5751 raw_weights_youtube = torch.nn.functional.normalize(self.scores, p=1, dim=0) * self.YOUTUBE_REWARDS_PERCENT
5752 # Normalize the focus rewards and scale by the percentage.
5753 raw_weights_focus = torch.nn.functional.normalize(self.focus_scores, p=1, dim=0) * self.FOCUS_REWARDS_PERCENT
5754 # Normalize the audio rewards and scale by the percentage.
5755 raw_weights_audio = torch.nn.functional.normalize(self.audio_score_arr, p=1, dim=0) * self.AUDIO_REWARDS_PERCENT
5756
5757 # Combine the youtube and focus rewards.
5758 raw_weights = raw_weights_youtube + raw_weights_focus + raw_weights_audio
5759
5760 bt.logging.debug("raw_weights_youtube", raw_weights_youtube)
5761 bt.logging.debug("raw_weights_focus", raw_weights_focus)
5762 bt.logging.debug("raw_weights_audio", raw_weights_audio)
5763 bt.logging.debug("raw_weights", raw_weights)
5764 bt.logging.debug("raw_weight_uids", self.metagraph.uids.to("cpu"))
5765 if raw_weights.shape[0] > self.metagraph.uids.shape[0]:
5766 bt.logging.warning("More raw_weights than metagraph uids, truncating raw_weights.")
5767 raw_weights = raw_weights[:self.metagraph.uids.shape[0]]
5768 # Process the raw weights to final_weights via subtensor limitations.
5769 try:
5770 (
5771 processed_weight_uids,
5772 processed_weights,
5773 ) = bt.utils.weight_utils.process_weights_for_netuid(
5774 uids=self.metagraph.uids.to("cpu"),
5775 weights=raw_weights.to("cpu"),
5776 netuid=self.config.netuid,
5777 subtensor=self.subtensor,
5778 metagraph=self.metagraph,
5779 )
5780 bt.logging.debug("processed_weights", processed_weights)
5781 bt.logging.debug("processed_weight_uids", processed_weight_uids)
5782 except Exception as e:
5783 bt.logging.error(f"Failed to process weights with exception: {e}, skipping set_weights this time")
5784 return
5785
5786 # Convert to uint16 weights and uids.
5787 (
5788 uint_uids,
5789 uint_weights,
5790 ) = bt.utils.weight_utils.convert_weights_and_uids_for_emit(
5791 uids=processed_weight_uids, weights=processed_weights
5792 )
5793 bt.logging.debug("uint_weights", uint_weights)
5794 bt.logging.debug("uint_uids", uint_uids)
5795
5796 # Set the weights on chain via our subtensor connection.
5797 result, result_msg = self.subtensor.set_weights(
5798 wallet=self.wallet,
5799 netuid=self.config.netuid,
5800 uids=uint_uids,
5801 weights=uint_weights,
5802 wait_for_finalization=False,
5803 wait_for_inclusion=False,
5804 version_key=self.spec_version,
5805 )
5806 if result is True:
5807 bt.logging.info("set_weights on chain successfully!")
5808 else:
5809 bt.logging.error(f"set_weights failed with message: {result_msg}")
5810
5811 def resync_metagraph(self):
5812 """Resyncs the metagraph and updates the hotkeys and moving averages based on the new metagraph."""
5813 bt.logging.info("resync_metagraph()")
5814
5815 # Copies state of metagraph before syncing.
5816 previous_metagraph = copy.deepcopy(self.metagraph)
5817
5818 # Sync the metagraph.
5819 self.metagraph.sync(subtensor=self.subtensor)
5820
5821 # Check if the metagraph axon info has changed.
5822 if previous_metagraph.axons == self.metagraph.axons:
5823 return
5824
5825 bt.logging.info(
5826 "Metagraph updated, re-syncing hotkeys, dendrite pool and moving averages"
5827 )
5828 # Zero out all hotkeys that have been replaced.
5829 for uid, hotkey in enumerate(self.hotkeys):
5830 if hotkey != self.metagraph.hotkeys[uid]:
5831 self.scores[uid] = 0 # hotkey has been replaced
5832 self.focus_scores[uid] = self.audio_score_arr[uid] = 0 # hotkey has been replaced
5833
5834 # Check to see if the metagraph has changed size.
5835 # If so, we need to add new hotkeys and moving averages.
5836 if len(self.hotkeys) < len(self.metagraph.hotkeys):
5837 # Update the size of the moving average scores.
5838 new_moving_average = torch.zeros((self.metagraph.n)).to(
5839 self.device
5840 )
5841 min_len = min(len(self.hotkeys), len(self.scores))
5842 new_moving_average[:min_len] = self.scores[:min_len]
5843 self.scores = new_moving_average
5844 self.focus_scores = new_moving_average.clone() # clone so the score arrays do not share storage
5845 self.audio_score_arr = new_moving_average.clone()
5846
5847 # Update the hotkeys.
5848 self.hotkeys = copy.deepcopy(self.metagraph.hotkeys)
5849
5850 def update_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5851 """Performs exponential moving average on the scores based on the rewards received from the miners."""
5852
5853 if len(rewards) == 0:
5854 bt.logging.debug("self.update_scores: Rewards are empty, returning early")
5855 return
5856
5857 if len(uids) == 0:
5858 bt.logging.debug("self.update_scores: Miner UIDs list is empty, returning early")
5859 return
5860
5861 if len(rewards) != len(uids):
5862 bt.logging.exception("self.update_scores: Rewards are not the same size as UIDs list (THIS SHOULD NEVER HAPPEN!)")
5863 return
5864
5865 # Check if rewards contains NaN values.
5866 if torch.isnan(rewards).any():
5867 bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5868 # Replace any NaN values in rewards with 0.
5869 rewards = torch.nan_to_num(rewards, 0)
5870
5871 # Check if `uids` is already a tensor and clone it to avoid the warning.
5872 if isinstance(uids, torch.Tensor):
5873 uids_tensor = uids.clone().detach()
5874 else:
5875 uids_tensor = torch.tensor(uids).to(self.device)
5876
5877 # Compute forward pass rewards, assumes uids are mutually exclusive.
5878 # shape: [ metagraph.n ]
5879 scattered_rewards: torch.FloatTensor = self.scores.to(self.device).scatter(
5880 0, uids_tensor.to(self.device), rewards.to(self.device)
5881 ).to(self.device)
5882 bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5883
5884 # Update scores with rewards produced by this step.
5885 # shape: [ metagraph.n ]
5886 alpha: float = self.config.neuron.moving_average_alpha
5887 self.scores: torch.FloatTensor = alpha * scattered_rewards + (
5888 1 - alpha
5889 ) * self.scores.to(self.device)
5890 bt.logging.debug(f"Updated moving avg scores: {self.scores}")
5891
5892 def update_focus_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5893 """Performs exponential moving average on the focus video scores based on the rewards received from the miners."""
5894
5895 # Check if rewards contains NaN values.
5896 if torch.isnan(rewards).any():
5897 bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5898 # Replace any NaN values in rewards with 0.
5899 rewards = torch.nan_to_num(rewards, 0)
5900
5901 # Check if `uids` is already a tensor and clone it to avoid the warning.
5902 if isinstance(uids, torch.Tensor):
5903 uids_tensor = uids.clone().detach()
5904 else:
5905 uids_tensor = torch.tensor(uids).to(self.device)
5906
5907 # Compute forward pass rewards, assumes uids are mutually exclusive.
5908 # shape: [ metagraph.n ]
5909 scattered_rewards: torch.FloatTensor = self.focus_scores.to(self.device).scatter(
5910 0, uids_tensor.to(self.device), rewards.to(self.device)
5911 ).to(self.device)
5912 bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5913
5914 # Update scores with rewards produced by this step.
5915 # shape: [ metagraph.n ]
5916 alpha: float = self.config.neuron.moving_average_alpha
5917 self.focus_scores: torch.FloatTensor = alpha * scattered_rewards + (
5918 1 - alpha
5919 ) * self.focus_scores.to(self.device)
5920 bt.logging.debug(f"Updated moving avg focus_scores: {self.focus_scores}")
5921
5922 def update_audio_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5923 """Performs exponential moving average on the audio scores based on the rewards received from the miners."""
5924
5925 # check if rewards contains NaN values.
5926 if torch.isnan(rewards).any():
5927 bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5928 # Replace any NaN values in rewards with 0.
5929 rewards = torch.nan_to_num(rewards, 0)
5930
5931 # check if `uids` is already a tensor and clone it to avoid the warning.
5932 if isinstance(uids, torch.Tensor):
5933 uids_tensor = uids.clone().detach()
5934 else:
5935 uids_tensor = torch.tensor(uids).to(self.device)
5936
5937 # compute forward pass rewards, assumes uids are mutually exclusive.
5938 # shape: [metagraph.n]
5939 scattered_rewards: torch.FloatTensor = self.audio_score_arr.to(self.device).scatter(
5940 0, uids_tensor.to(self.device), rewards.to(self.device)
5941 ).to(self.device)
5942 bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5943
5944 # update scores with rewards produced by this step.
5945 # shape: [metagraph.n]
5946 alpha: float = self.config.neuron.moving_average_alpha
5947 self.audio_score_arr: torch.FloatTensor = alpha * scattered_rewards + (
5948 1 - alpha
5949 ) * self.audio_score_arr.to(self.device)
5950 bt.logging.debug(f"Updated moving avg audio_scores: {self.audio_score_arr}")
5951
5952 def save_state(self):
5953 """Saves the state of the validator to a file."""
5954 bt.logging.info("Saving validator state.")
5955
5956 # Save the state of the validator to file.
5957 torch.save(
5958 {
5959 "step": self.step,
5960 "scores": self.scores,
5961 "focus_scores": self.focus_scores,
5962 "audio_score_arr": self.audio_score_arr,
5963 "hotkeys": self.hotkeys,
5964 },
5965 self.config.neuron.full_path + "/state.pt",
5966 )
5967
5968 def load_state(self):
5969 """Loads the state of the validator from a file."""
5970 bt.logging.info("Loading validator state.")
5971
5972 if not os.path.exists(self.config.neuron.full_path + "/state.pt"):
5973 bt.logging.warning("No saved state found")
5974 return
5975
5976 # Load the state of the validator from file.
5977 state = torch.load(self.config.neuron.full_path + "/state.pt", map_location=self.device)
5978 self.step = state["step"]
5979 self.scores = state["scores"]
5980 if "focus_scores" in state:
5981 self.focus_scores = state["focus_scores"]
5982 else:
5983 self.focus_scores = torch.zeros(
5984 self.metagraph.n, dtype=torch.float32, device=self.device
5985 )
5986
5987 if "audio_score_arr" in state:
5988 self.audio_score_arr = state["audio_score_arr"]
5989 else:
5990 self.audio_score_arr = torch.zeros(
5991 self.metagraph.n, dtype=torch.float32, device=self.device
5992 )
5993 self.hotkeys = state["hotkeys"]
5994
5995
5996
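The scoring update used by update_scores, update_focus_scores and update_audio_scores above reduces to a scatter followed by a one-line exponential moving average; a self-contained sketch with toy values:

    import torch

    scores = torch.zeros(5)                       # moving-average scores for 5 uids
    uids = torch.tensor([1, 3])                   # uids that responded this step
    rewards = torch.tensor([0.8, 0.4])            # their rewards
    alpha = 0.3                                   # config.neuron.moving_average_alpha

    scattered = scores.scatter(0, uids, rewards)  # place rewards at their uid indices
    scores = alpha * scattered + (1 - alpha) * scores
    print(scores)  # tensor([0.0000, 0.2400, 0.0000, 0.1200, 0.0000])

At weight-setting time each of the three arrays is L1-normalized (torch.nn.functional.normalize(..., p=1, dim=0)), so each sums to 1 before being scaled by its rewards percentage and combined.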
5997---
5998File: /omega/utils/__init__.py
5999---
6000
6001from . import config
6002from . import misc
6003from . import uids
6004
6005
6006
6007---
6008File: /omega/utils/config.py
6009---
6010
6011# The MIT License (MIT)
6012# Copyright © 2023 Yuma Rao
6013# Copyright © 2023 Opentensor Foundation
6014
6015# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6016# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6017# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6018# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6019
6020# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6021# the Software.
6022
6023# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6024# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6025# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6026# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6027# DEALINGS IN THE SOFTWARE.
6028
6029import os
6030import subprocess
6031import argparse
6032import bittensor as bt
6033from .logging import setup_events_logger
6034from enum import Enum
6035
6036
6037def is_cuda_available():
6038 try:
6039 output = subprocess.check_output(["nvidia-smi", "-L"], stderr=subprocess.STDOUT)
6040 if "NVIDIA" in output.decode("utf-8"):
6041 return "cuda"
6042 except Exception:
6043 pass
6044 try:
6045 output = subprocess.check_output(["nvcc", "--version"]).decode("utf-8")
6046 if "release" in output:
6047 return "cuda"
6048 except Exception:
6049 pass
6050 return "cpu"
6051
6052def check_config(cls, config: "bt.Config"):
6053 r"""Checks/validates the config namespace object."""
6054 bt.logging.check_config(config)
6055
6056 full_path = os.path.expanduser(
6057 "{}/{}/{}/netuid{}/{}".format(
6058 config.logging.logging_dir, # TODO: change from ~/.bittensor/miners to ~/.bittensor/neurons
6059 config.wallet.name,
6060 config.wallet.hotkey,
6061 config.netuid,
6062 config.neuron.name,
6063 )
6064 )
6065 print("full path:", full_path)
6066 config.neuron.full_path = os.path.expanduser(full_path)
6067 if not os.path.exists(config.neuron.full_path):
6068 os.makedirs(config.neuron.full_path, exist_ok=True)
6069
6070 if not config.neuron.dont_save_events:
6071 # Add custom event logger for the events.
6072 events_logger = setup_events_logger(
6073 config.neuron.full_path, config.neuron.events_retention_size
6074 )
6075 bt.logging.register_primary_logger(events_logger.name)
6076
6077
6078def add_args(cls, parser):
6079 """
6080 Adds relevant arguments to the parser for operation.
6081 """
6082
6083 parser.add_argument("--netuid", type=int, help="Subnet netuid", default=1)
6084
6085 parser.add_argument(
6086 "--neuron.device",
6087 type=str,
6088 help="Device to run on.",
6089 default=is_cuda_available(),
6090 )
6091
6092 parser.add_argument(
6093 "--neuron.epoch_length",
6094 type=int,
6095 help="The default epoch length (how often we set weights, measured in 12 second blocks).",
6096 default=100,
6097 )
6098
6099 parser.add_argument(
6100 "--mock",
6101 action="store_true",
6102 help="Mock neuron and all network components.",
6103 default=False,
6104 )
6105
6106 parser.add_argument(
6107 "--neuron.events_retention_size",
6108 type=int,
6109 help="Events retention size in bytes.",
6110 default=2 * 1024 * 1024 * 1024, # 2 GB
6111 )
6112
6113 parser.add_argument(
6114 "--neuron.dont_save_events",
6115 action="store_true",
6116 help="If set, we dont save events to a log file.",
6117 default=False,
6118 )
6119
6120 parser.add_argument(
6121 "--neuron.decentralization.off",
6122 action="store_true",
6123 help="Disable decentralization (not recommended).",
6124 default=False,
6125 )
6126
6127 parser.add_argument(
6128 "--neuron.focus_videos",
6129 action="store_true",
6130 help="If set, we will enable OMEGA Focus app video logic.",
6131 default=False,
6132 )
6133
6134 parser.add_argument(
6135 "--wandb.off",
6136 action="store_true",
6137 help="Turn off wandb.",
6138 default=False,
6139 )
6140
6141 parser.add_argument(
6142 "--wandb.offline",
6143 action="store_true",
6144 help="Runs wandb in offline mode.",
6145 default=False,
6146 )
6147
6148 parser.add_argument(
6149 "--wandb.notes",
6150 type=str,
6151 help="Notes to add to the wandb run.",
6152 default="",
6153 )
6154
6155
6156class QueryAugment(Enum):
6157 NoAugment = "NoAugment"
6158 LocalLLMAugment = "LocalLLMAugment"
6159 OpenAIAugment = "OpenAIAugment"
6160
6161
6162def add_miner_args(cls, parser):
6163 """Add miner specific arguments to the parser."""
6164
6165 parser.add_argument(
6166 "--neuron.name",
6167 type=str,
6168 help="Trials for this neuron go in neuron.root / (wallet_cold - wallet_hot) / neuron.name. ",
6169 default="miner",
6170 )
6171
6172 parser.add_argument(
6173 "--neuron.query_augment",
6174 type=str,
6175 help="The query augmentation class to use.",
6176 choices=[e.value for e in QueryAugment],
6177 default=QueryAugment.LocalLLMAugment.value,
6178 )
6179
6180 parser.add_argument(
6181 "--blacklist.force_validator_permit",
6182 action="store_true",
6183 help="If set, we will force incoming requests to have a permit.",
6184 default=False,
6185 )
6186
6187 parser.add_argument(
6188 "--blacklist.allow_non_registered",
6189 action="store_true",
6190 help="If set, miners will accept queries from non-registered entities. (Dangerous!)",
6191 default=False,
6192 )
6193
6194 parser.add_argument(
6195 "--blacklist.validator_min_stake",
6196 help="Minimum stake a validator must have to allow queries",
6197 default=10240,
6198 type=int,
6199 )
6200
6201 parser.add_argument(
6202 "--wandb.project_name",
6203 type=str,
6204 default="template-miners",
6205 help="Wandb project to log to.",
6206 )
6207
6208 parser.add_argument(
6209 "--wandb.entity",
6210 type=str,
6211 default="opentensor-dev",
6212 help="Wandb entity to log to.",
6213 )
6214
6215
6216def add_validator_args(cls, parser):
6217 """Add validator specific arguments to the parser."""
6218
6219 parser.add_argument(
6220 "--neuron.name",
6221 type=str,
6222 help="Trials for this neuron go in neuron.root / (wallet_cold - wallet_hot) / neuron.name. ",
6223 default="validator",
6224 )
6225
6226 parser.add_argument(
6227 "--neuron.timeout",
6228 type=float,
6229 help="The timeout for each forward call in seconds.",
6230 default=10,
6231 )
6232
6233 parser.add_argument(
6234 "--neuron.num_concurrent_forwards",
6235 type=int,
6236 help="The number of concurrent forwards running at any time.",
6237 default=1,
6238 )
6239
6240 parser.add_argument(
6241 "--neuron.sample_size",
6242 type=int,
6243 help="The number of miners to query in a single step.",
6244 default=10,
6245 )
6246
6247 parser.add_argument(
6248 "--neuron.disable_set_weights",
6249 action="store_true",
6250 help="Disables setting weights.",
6251 default=False,
6252 )
6253
6254 parser.add_argument(
6255 "--neuron.moving_average_alpha",
6256 type=float,
6257 help="Moving average alpha parameter, how much to add of the new observation.",
6258 default=0.3,
6259 )
6260
6261 parser.add_argument(
6262 "--neuron.axon_off",
6263 "--axon_off",
6264 action="store_true",
6265 # Note: the validator needs to serve an Axon with their IP or they may
6266 # be blacklisted by the firewall of serving peers on the network.
6267 help="Set this flag to not attempt to serve an Axon.",
6268 default=False,
6269 )
6270
6271 parser.add_argument(
6272 "--neuron.vpermit_tao_limit",
6273 type=int,
6274 help="The maximum stake (in TAO) a hotkey with a validator permit may have and still be queried as a miner.",
6275 default=4096,
6276 )
6277
6278 parser.add_argument(
6279 "--wandb.project_name",
6280 type=str,
6281 help="The name of the project where you are sending the new run.",
6282 default="template-validators",
6283 )
6284
6285 parser.add_argument(
6286 "--neuron.auto_update",
6287 action="store_true",
6288 help="Quits the validator if it is out of date.",
6289 default=False,
6290 )
6291
6292 parser.add_argument(
6293 "--topics_url",
6294 type=str,
6295 help="URL to fetch topics from.",
6296 default="https://docs.google.com/spreadsheets/d/e/2PACX-1vR3jKfd4qkxXt5rTvXTTSsz_RYGkxcxh6-jvB9H0Mljiz-nai7xG-E63qEQ9jQhQabBrIAeJWtgKg5j/pub?gid=0&single=true&output=csv"
6297 )
6298
6299 parser.add_argument(
6300 "--topics_path",
6301 type=str,
6302 help="Path to text file containing a list of random topics to collect data for.",
6303 default=os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..", "topics.txt")
6304 )
6305
6306def config(cls):
6307 """
6308 Returns the configuration object specific to this miner or validator after adding relevant arguments.
6309 """
6310 parser = argparse.ArgumentParser()
6311 bt.wallet.add_args(parser)
6312 bt.subtensor.add_args(parser)
6313 bt.logging.add_args(parser)
6314 bt.axon.add_args(parser)
6315 cls.add_args(parser)
6316 return bt.config(parser)
6317
6318
6319
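Dotted argument names such as --neuron.epoch_length become nested attributes on the resulting config object; a minimal sketch, assuming the installed bittensor exposes bt.config(parser) as it is used in config() above:

    import argparse
    import bittensor as bt

    parser = argparse.ArgumentParser()
    parser.add_argument("--netuid", type=int, default=1)
    parser.add_argument("--neuron.epoch_length", type=int, default=100)

    config = bt.config(parser)           # parses argv into a nested namespace
    print(config.netuid)                 # 1 (absent CLI overrides)
    print(config.neuron.epoch_length)    # 100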
6320---
6321File: /omega/utils/logging.py
6322---
6323
6324import os
6325import logging
6326from logging.handlers import RotatingFileHandler
6327
6328EVENTS_LEVEL_NUM = 38
6329DEFAULT_LOG_BACKUP_COUNT = 10
6330
6331
6332def setup_events_logger(full_path, events_retention_size):
6333 logging.addLevelName(EVENTS_LEVEL_NUM, "EVENT")
6334
6335 logger = logging.getLogger("event")
6336 logger.setLevel(EVENTS_LEVEL_NUM)
6337
6338 def event(self, message, *args, **kws):
6339 if self.isEnabledFor(EVENTS_LEVEL_NUM):
6340 self._log(EVENTS_LEVEL_NUM, message, args, **kws)
6341
6342 logging.Logger.event = event
6343
6344 formatter = logging.Formatter(
6345 "%(asctime)s | %(levelname)s | %(message)s",
6346 datefmt="%Y-%m-%d %H:%M:%S",
6347 )
6348
6349 file_handler = RotatingFileHandler(
6350 os.path.join(full_path, "events.log"),
6351 maxBytes=events_retention_size,
6352 backupCount=DEFAULT_LOG_BACKUP_COUNT,
6353 )
6354 file_handler.setFormatter(formatter)
6355 file_handler.setLevel(EVENTS_LEVEL_NUM)
6356 logger.addHandler(file_handler)
6357
6358 return logger
6359
6360
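A usage sketch for the events logger above; the path and retention size here are illustrative, and the directory must exist before the rotating handler opens its file:

    import os
    from omega.utils.logging import setup_events_logger

    os.makedirs("/tmp/neuron", exist_ok=True)  # handler needs the directory to exist
    logger = setup_events_logger("/tmp/neuron", events_retention_size=2 * 1024 * 1024)
    logger.event("miner response scored")      # appended to /tmp/neuron/events.log at level EVENT (38)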
6361---
6362File: /omega/utils/misc.py
6363---
6364
6365# The MIT License (MIT)
6366# Copyright © 2023 Yuma Rao
6367# Copyright © 2023 Opentensor Foundation
6368
6369# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6370# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6371# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6372# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6373
6374# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6375# the Software.
6376
6377# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6378# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6379# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6380# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6381# DEALINGS IN THE SOFTWARE.
6382
6383import time
6384import math
6385import hashlib as rpccheckhealth
6386from math import floor
6387from typing import Callable, Any
6388from functools import lru_cache, update_wrapper
6389
6390
6391# LRU Cache with TTL
6392def ttl_cache(maxsize: int = 128, typed: bool = False, ttl: int = -1):
6393 """
6394 Decorator that creates a cache of the most recently used function calls with a time-to-live (TTL) feature.
6395 The cache evicts the least recently used entries if the cache exceeds the `maxsize` or if an entry has
6396 been in the cache longer than the `ttl` period.
6397
6398 Args:
6399 maxsize (int): Maximum size of the cache. Once the cache grows to this size, subsequent entries
6400 replace the least recently used ones. Defaults to 128.
6401 typed (bool): If set to True, arguments of different types will be cached separately. For example,
6402 f(3) and f(3.0) will be treated as distinct calls with distinct results. Defaults to False.
6403 ttl (int): The time-to-live for each cache entry, measured in seconds. If set to a non-positive value,
6404 the TTL falls back to 65536 seconds (~18 hours). Defaults to -1.
6405
6406 Returns:
6407 Callable: A decorator that can be applied to functions to cache their return values.
6408
6409 The decorator is useful for caching results of functions that are expensive to compute and are called
6410 with the same arguments frequently within short periods of time. The TTL feature helps in ensuring
6411 that the cached values are not stale.
6412
6413 Example:
6414 @ttl_cache(ttl=10)
6415 def get_data(param):
6416 # Expensive data retrieval operation
6417 return data
6418 """
6419 if ttl <= 0:
6420 ttl = 65536
6421 hash_gen = _ttl_hash_gen(ttl)
6422
6423 def wrapper(func: Callable) -> Callable:
6424 @lru_cache(maxsize, typed)
6425 def ttl_func(ttl_hash, *args, **kwargs):
6426 return func(*args, **kwargs)
6427
6428 def wrapped(*args, **kwargs) -> Any:
6429 th = next(hash_gen)
6430 return ttl_func(th, *args, **kwargs)
6431
6432 return update_wrapper(wrapped, func)
6433
6434 return wrapper
6435
6436
6437def _ttl_hash_gen(seconds: int):
6438 """
6439 Internal generator function used by the `ttl_cache` decorator to generate a new hash value at regular
6440 time intervals specified by `seconds`.
6441
6442 Args:
6443 seconds (int): The number of seconds after which a new hash value will be generated.
6444
6445 Yields:
6446 int: A hash value that represents the current time interval.
6447
6448 This generator is used to create time-based hash values that enable the `ttl_cache` to determine
6449 whether cached entries are still valid or if they have expired and should be recalculated.
6450 """
6451 start_time = time.time()
6452 while True:
6453 yield floor((time.time() - start_time) / seconds)
6454
6455
6456# The chain produces a new block roughly every 12 seconds.
6457@ttl_cache(maxsize=1, ttl=12)
6458def ttl_get_block(self) -> int:
6459 """
6460 Retrieves the current block number from the blockchain. This method is cached with a time-to-live (TTL)
6461 of 12 seconds, meaning that it will only refresh the block number from the blockchain at most every 12 seconds,
6462 reducing the number of calls to the underlying blockchain interface.
6463
6464 Returns:
6465 int: The current block number on the blockchain.
6466
6467 This method is useful for applications that need to access the current block number frequently and can
6468 tolerate a delay of up to 12 seconds for the latest information. By using a cache with TTL, the method
6469 efficiently reduces the workload on the blockchain interface.
6470
6471 Example:
6472 current_block = ttl_get_block(self)
6473
6474 Note: self here is the miner or validator instance
6475 """
6476 return self.subtensor.get_current_block()
6477
6478
6479
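The TTL behavior of ttl_cache can be observed directly; a small sketch, noting that windows are measured from when the decorator was applied (floor(elapsed / ttl)), so the first comparison can occasionally land on a boundary:

    import time
    from omega.utils.misc import ttl_cache

    @ttl_cache(maxsize=1, ttl=2)
    def now() -> float:
        return time.time()

    first = now()
    time.sleep(0.5)
    print(now() == first)  # True: usually still inside the same 2-second window
    time.sleep(2.0)
    print(now() == first)  # False: the window rolled over, so the function re-ran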
6480---
6481File: /omega/utils/uids.py
6482---
6483
6484import torch
6485import random
6486import bittensor as bt
6487from typing import List
6488
6489
6490def check_uid_availability(
6491 metagraph: "bt.metagraph.Metagraph", uid: int, vpermit_tao_limit: int
6492) -> bool:
6493 """Check if uid is available. A uid is available if it is serving and, when it holds a validator permit, has no more than vpermit_tao_limit stake
6494 Args:
6495 metagraph (:obj: bt.metagraph.Metagraph): Metagraph object
6496 uid (int): uid to be checked
6497 vpermit_tao_limit (int): Validator permit tao limit
6498 Returns:
6499 bool: True if uid is available, False otherwise
6500 """
6501 # Filter non serving axons.
6502 if not metagraph.axons[uid].is_serving:
6503 return False
6504 # Filter out hotkeys holding a validator permit with more than vpermit_tao_limit stake.
6505 if metagraph.validator_permit[uid]:
6506 if metagraph.S[uid] > vpermit_tao_limit:
6507 return False
6508 # Available otherwise.
6509 return True
6510
6511
6512def get_random_uids(
6513 self, k: int, exclude: List[int] = None
6514) -> torch.LongTensor:
6515 """Returns k available random uids from the metagraph.
6516 Args:
6517 k (int): Number of uids to return.
6518 exclude (List[int]): List of uids to exclude from the random sampling.
6519 Returns:
6520 uids (torch.LongTensor): Randomly sampled available uids.
6521 Notes:
6522 If `k` is larger than the number of available `uids`, set `k` to the number of available `uids`.
6523 """
6524 candidate_uids = []
6525 avail_uids = []
6526
6527 for uid in range(self.metagraph.n.item()):
6528 uid_is_available = check_uid_availability(
6529 self.metagraph, uid, self.config.neuron.vpermit_tao_limit
6530 )
6531 uid_is_not_excluded = exclude is None or uid not in exclude
6532
6533 if uid_is_available:
6534 avail_uids.append(uid)
6535 if uid_is_not_excluded:
6536 candidate_uids.append(uid)
6537
6538 # Check if candidate_uids contains enough uids for querying; if not, pad with other available uids
6539 available_uids = candidate_uids
6540 if len(candidate_uids) < k:
6541 new_avail_uids = [uid for uid in avail_uids if uid not in candidate_uids]
6542 available_uids += random.sample(
6543 new_avail_uids,
6544 min(len(new_avail_uids), k - len(candidate_uids)),
6545 )
6546 uids = torch.tensor(random.sample(
6547 available_uids,
6548 min(k, len(available_uids))
6549 )).to(self.device)
6550 return uids
6551
6552
6553
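check_uid_availability only touches three metagraph fields, so it can be exercised with a stub; a sketch using SimpleNamespace stand-ins (all values hypothetical):

    from types import SimpleNamespace
    from omega.utils.uids import check_uid_availability

    serving = SimpleNamespace(is_serving=True)
    metagraph = SimpleNamespace(
        axons=[serving, serving],
        validator_permit=[True, False],
        S=[20_000, 50],  # stake per uid
    )
    print(check_uid_availability(metagraph, 0, vpermit_tao_limit=4096))  # False: permitted validator above the stake limit
    print(check_uid_availability(metagraph, 1, vpermit_tao_limit=4096))  # True: serving miner without a permit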
6554---
6555File: /omega/validator/__init__.py
6556---
6557
6558
6559
6560
6561---
6562File: /omega/__init__.py
6563---
6564
6565# The MIT License (MIT)
6566# Copyright © 2023 Yuma Rao
6567# Copyright © 2023 Omega Labs, Inc.
6568
6569# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6570# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6571# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6572# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6573
6574# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6575# the Software.
6576
6577# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6578# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6579# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6580# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6581# DEALINGS IN THE SOFTWARE.
6582
6583# TODO(developer): Change this value when updating your code base.
6584# Define the version of the template module.
6585__version__ = "2.0.0"
6586version_split = __version__.split(".")
6587__spec_version__ = (
6588 (1000 * int(version_split[0]))
6589 + (10 * int(version_split[1]))
6590 + (1 * int(version_split[2]))
6591)
6592
6593# Import all submodules.
6594from . import protocol
6595from . import base
6596from . import validator
6597from . import api
6598from .subnet_links import SUBNET_LINKS
6599
6600
6601
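For __version__ = "2.0.0", the scheme above yields __spec_version__ = 1000*2 + 10*0 + 1*0 = 2000. Note the encoding only stays collision-free while the patch component is below 10 and the minor component below 100.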
6602---
6603File: /omega/audio_scoring.py
6604---
6605
6606import numpy as np
6607if hasattr(np, 'nan'):
6608 np.NaN = np.nan # restore aliases removed in NumPy 2.0, which pyannote still expects
6609 np.NAN = np.nan
6610from pyannote.audio import Pipeline
6611import librosa
6612import os
6613import dotenv
6614import pandas as pd
6615import torch
6616
6617
6618dotenv.load_dotenv()
6619
6620class AudioScore:
6621 def __init__(self, device="cuda"):
6622
6623 self.device = torch.device(device)
6624
6625 # Load the voice activity detection (VAD) pipeline
6626 self.pipeline = Pipeline.from_pretrained("salmanshahid/vad").to(self.device)
6627
6628
6629 self.steepness = 5
6630 self.midpoint = 0.3
6631
6632
6633
6634 def speech_content_score(self, audio_arr, sr):
6635 self.total_duration = librosa.get_duration(y=audio_arr, sr=sr)
6636 output = self.pipeline({"waveform": torch.from_numpy(audio_arr.astype(np.float32)).unsqueeze(0).to(self.device), "sample_rate": sr})
6637
6638 self.total_speech_duration = 0
6639 for speech in output.get_timeline().support():
6640 self.total_speech_duration += speech.end - speech.start
6641
6642 ratio = self.total_speech_duration / self.total_duration
6643
6644
6645 return ratio
6646
6647 def speaker_dominance_score(self, timestamps_start, timestamps_end, speakers, dominance_threshold=0.7):
6648 if timestamps_start is None:
6649 self.rttm_data = None
6650 return 0
6651 self.rttm_data = pd.DataFrame({
6652 'start': timestamps_start,
6653 'end': timestamps_end,
6654 'speaker': speakers
6655 })
6656
6657 # If there's only one speaker, return 0 since dominance is expected
6658 if len(set(speakers)) == 1:
6659 return 0
6660
6661 # Calculate total duration for each speaker
6662 speaker_durations = {}
6663 for _, row in self.rttm_data.iterrows():
6664 speaker = row['speaker']
6665 duration = row['end'] - row['start']
6666 if speaker in speaker_durations:
6667 speaker_durations[speaker] += duration
6668 else:
6669 speaker_durations[speaker] = duration
6670 max_time = max(speaker_durations.values())
6671 min_time = min(speaker_durations.values())
6672
6673 return 1 - (max_time - min_time) / self.total_duration
6674
6675
6676 def background_noise_score(self, audio_arr, sr, noise_threshold=0.1):
6677 # Load audio and calculate SNR
6678 self.audio = audio_arr
6679 self.sr = sr
6680
6681 # Calculate signal power
6682 signal_power = np.mean(self.audio**2)
6683
6684 # Estimate noise power (using the lowest 10% of frame energies as noise estimate)
6685 frame_length = int(0.025 * self.sr) # 25ms frames
6686 frames = librosa.util.frame(self.audio, frame_length=frame_length, hop_length=frame_length)
6687 frame_energies = np.mean(frames**2, axis=0)
6688 noise_power = np.mean(np.percentile(frame_energies, 10))
6689
6690 # Calculate SNR in dB
6691 if noise_power == 0:
6692 snr = 100 # High SNR for very clean signal
6693 else:
6694 snr = 10 * np.log10(signal_power / noise_power)
6695
6696 # Convert SNR to a 0-1 score (higher SNR = higher score)
6697 return 1 - max(0, 1 - (snr / 50)) # Normalize to 0-1 range, assuming 50dB as reference
6698
6699 def unique_speakers_error(self, speakers):
6700 unique_speakers = len(set(speakers))
6701 if unique_speakers == 2:
6702 return 1
6703 elif unique_speakers == 1 or unique_speakers == 0 or unique_speakers > 4:
6704 return 0
6705 else:
6706 return 1/(unique_speakers-1)
6707
6708 def total_score(self, audio_arr, sr, timestamps_start, timestamps_end, speakers):
6709 audio_arr = np.array(audio_arr)
6710 timestamps_start = np.array(timestamps_start)
6711 timestamps_end = np.array(timestamps_end)
6712 # speakers = torch.tensor(speakers)
6713 speech_content_score = self.speech_content_score(audio_arr, sr)
6714 speaker_dominance_score = self.speaker_dominance_score(timestamps_start, timestamps_end, speakers)
6715 background_noise_score = self.background_noise_score(audio_arr, sr)
6716 return {
6717 "speech_content_score": speech_content_score,
6718 "speaker_dominance_score": speaker_dominance_score,
6719 "background_noise_score": background_noise_score,
6720 "unique_speakers_error": self.unique_speakers_error(speakers),
6721 }
6722
6723
6724if __name__ == "__main__":
6725
6726 from datasets import load_dataset
6727 import huggingface_hub
6728
6729
6730 repo_id = "diarizers-community/voxconverse"
6731
6732 ds = load_dataset(repo_id, split="test", cache_dir="/workspace/tezuesh/voxconverse/data_cache")
6733
6734 ds = next(ds.shuffle().iter(batch_size=64))
6735 audio_arr = ds['audio'][0]['array']
6736 sr = ds['audio'][0]['sampling_rate']
6737 timestamps_start = ds['timestamps_start'][0]
6738 timestamps_end = ds['timestamps_end'][0]
6739 speakers = ds['speakers'][0]
6740
6741
6742 # Save test audio to WAV file
6743 import soundfile as sf
6744
6745 output_audio_path = 'test_audio.wav'
6746 sf.write(output_audio_path, audio_arr, sr)
6747 print(f"Saved test audio to {output_audio_path}")
6748 # Create a DataFrame with timestamps and speakers
6749 import pandas as pd
6750
6751 df = pd.DataFrame({
6752 'start': timestamps_start,
6753 'end': timestamps_end,
6754 'speaker': speakers
6755 })
6756
6757 # Save to CSV file
6758 output_path = 'speaker_timestamps.csv'
6759 df.to_csv(output_path, index=False)
6760 print(f"Saved speaker timestamps to {output_path}")
6761 audio_score = AudioScore()
6762
6763 score = audio_score.total_score(audio_arr, sr, timestamps_start, timestamps_end, speakers)
6764 print(score)
6765
6766
6767
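total_score returns the four sub-scores unweighted; one plausible way a caller could fold them into a single value is with the scaling factors from /omega/constants.py. The weighting below is an illustration under that assumption, not necessarily the validator's exact combination:

    from omega.constants import (
        BACKGROUND_NOISE_SCALING_FACTOR,
        SPEAKER_DOMINANCE_SCALING_FACTOR,
        SPEECH_CONTENT_SCALING_FACTOR,
        UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR,
    )

    # Hypothetical sub-scores, shaped like AudioScore.total_score() output.
    scores = {
        "speech_content_score": 0.9,
        "speaker_dominance_score": 0.8,
        "background_noise_score": 0.95,
        "unique_speakers_error": 1.0,
    }
    combined = (
        SPEECH_CONTENT_SCALING_FACTOR * scores["speech_content_score"]
        + SPEAKER_DOMINANCE_SCALING_FACTOR * scores["speaker_dominance_score"]
        + BACKGROUND_NOISE_SCALING_FACTOR * scores["background_noise_score"]
        + UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR * scores["unique_speakers_error"]
    )
    print(round(combined, 3))  # 0.2*0.9 + 0.2*0.8 + 0.1*0.95 + 0.5*1.0 = 0.935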
6768---
6769File: /omega/augment.py
6770---
6771
6772import bittensor as bt
6773
6774from openai import OpenAI
6775import torch
6776from transformers import pipeline
6777
6778
6779def get_llm_prompt(query: str) -> str:
6780 return f"Take the given query `{query}` and augment it to be more detailed. For example, add specific names, types, embellishments, richness. Do not make it longer than 12 words."
6781
6782
6783class AbstractAugment:
6784 def __init__(self, **kwargs):
6785 pass
6786
6787 def __call__(self, query: str) -> str:
6788 try:
6789 new_query = self.augment_query(query)
6790 bt.logging.info(f"Augmented query: '{query}' -> '{new_query}'")
6791 return new_query
6792 except Exception as e:
6793 print(f"Error augmenting query: {e}")
6794 return query
6795
6796 def augment_query(self, query: str) -> str:
6797 raise NotImplementedError
6798
6799
6800class NoAugment(AbstractAugment):
6801 def __init__(self, **kwargs):
6802 bt.logging.info("Running no query augmentation")
6803
6804 def augment_query(self, query: str) -> str:
6805 return query
6806
6807
6808class LocalLLMAugment(AbstractAugment):
6809 def __init__(self, **kwargs):
6810 self.device = kwargs.get("device")
6811 if self.device == "cpu":
6812 raise ValueError("Cannot run Local LLM on CPU. Please move to a GPU instance or restart miner with `--neuron.query_augment OpenAIAugment` to use the GPT-4 API for augmenting instead of a local LLM.")
6813 model_name = "teknium/OpenHermes-2.5-Mistral-7B"
6814 self.pipe = pipeline("text-generation", model=model_name, device=self.device, torch_dtype=torch.float16, pad_token_id=32000)
6815 bt.logging.info(f"Running query augmentation with local LLM {model_name} (thanks Nous!)")
6816
6817 def augment_query(self, query: str) -> str:
6818 prompt = f"""<|im_start|>system
6819 You are "Hermes 2", a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia.<|im_end|>
6820 <|im_start|>user
6821 {get_llm_prompt(query)}<|im_end|>
6822 <|im_start|>assistant
6823 Detailed query: """
6824 new_query = self.pipe(prompt, max_new_tokens=64)[0]["generated_text"][len(prompt):].strip().strip("\"").strip("'")
6825 return new_query
6826
6827
6828class OpenAIAugment(AbstractAugment):
6829 def __init__(self, **kwargs):
6830 self.client = OpenAI()
6831 bt.logging.info("Running query augmentation with OpenAI GPT-4")
6832
6833 def augment_query(self, query: str) -> str:
6834 response = self.client.chat.completions.create(
6835 model="gpt-4-turbo-preview",
6836 messages=[
6837 {
6838 "role": "user",
6839 "content": get_llm_prompt(query)
6840 }
6841 ],
6842 temperature=0.9,
6843 max_tokens=64,
6844 top_p=1,
6845 )
6846 return response.choices[0].message.content.strip("\"").strip("'")
6847
6848
6849
6850---
6851File: /omega/constants.py
6852---
6853
6854# Task rewards percent
6855FOCUS_REWARDS_PERCENT = 0.025
6856AUDIO_REWARDS_PERCENT = 0.125
6857
6858# Video length constants
6859MIN_VIDEO_LENGTH = 5 # five seconds
6860MAX_VIDEO_LENGTH = 120 # two minutes
6861FIVE_MINUTES = 300 # 5 minutes in seconds
6862TEN_MINUTES = 600 # 10 minutes in seconds
6863VALIDATOR_TIMEOUT = 90 # 1.5 minutes
6864VALIDATOR_TIMEOUT_MARGIN = 30 # 30 seconds
6865VALIDATOR_TIMEOUT_AUDIO = 60 # 1 minute
6866
6867# Validator constants
6868CHECK_PROBABILITY = 0.1
6869DIFFERENCE_THRESHOLD = 0.1
6870SIMILARITY_THRESHOLD = 0.95
6871VIDEO_DOWNLOAD_TIMEOUT = 10
6872MIN_SCORE = 0.005
6873FAKE_VIDEO_PUNISHMENT = -5.0
6874QUERY_RELEVANCE_SCALING_FACTOR = 1.3
6875DESCRIPTION_RELEVANCE_SCALING_FACTOR = 0.7
6876VIDEO_RELEVANCE_WEIGHT = 0.65
6877FOCUS_MIN_SCORE = 0
6878MAX_FOCUS_SCORE = 1000
6879STUFFED_DESCRIPTION_PUNISHMENT = -5.0
6880
6881# Description length scaling values.
6882DESCRIPTION_LENGTH_WEIGHT = 0.35
6883MIN_LENGTH_BOOST_TOKEN_COUNT = 100
6884MAX_LENGTH_BOOST_TOKEN_COUNT = 300
6885
6886
6887# Audio score constants
6888MIN_AUDIO_LENGTH_SECONDS = 45
6889MAX_AUDIO_LENGTH_SECONDS = 80
6890MIN_AUDIO_LENGTH_SCORE = 0.7
6891SPEAKER_DOMINANCE_SCALING_FACTOR = 0.2
6892BACKGROUND_NOISE_SCALING_FACTOR = 0.1
6893UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR = 0.5
6894SPEECH_CONTENT_SCALING_FACTOR = 1.0 - BACKGROUND_NOISE_SCALING_FACTOR - SPEAKER_DOMINANCE_SCALING_FACTOR - UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR
6895AUDIO_LENGTH_SCALING_FACTOR = 0.1 # max 1
6896AUDIO_QUALITY_SCALING_FACTOR = 0.2 # max 1
6897DIARIZATION_SCALING_FACTOR = 0.6 # max 1
6898AUDIO_QUERY_RELEVANCE_SCALING_FACTOR = 1.0 - DIARIZATION_SCALING_FACTOR - AUDIO_LENGTH_SCALING_FACTOR - AUDIO_QUALITY_SCALING_FACTOR
6899
6900
6901
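Since the derived shares are simple complements, they can be checked by hand: the validator's YouTube share is 1.0 - FOCUS_REWARDS_PERCENT - AUDIO_REWARDS_PERCENT = 1.0 - 0.025 - 0.125 = 0.85, SPEECH_CONTENT_SCALING_FACTOR evaluates to 1.0 - 0.1 - 0.2 - 0.5 = 0.2, and AUDIO_QUERY_RELEVANCE_SCALING_FACTOR evaluates to 1.0 - 0.6 - 0.1 - 0.2 = 0.1.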
6902---
6903File: /omega/diarization_metric.py
6904---
6905
6906from pyannote.core import Segment, Timeline, Annotation
6907from pyannote.metrics.diarization import DiarizationErrorRate
6908from omega.diarization_pipeline import CustomDiarizationPipeline
6909import numpy as np
6910
6911
6912
6913
6914def calculate_diarization_metrics(audio_arr, sr, true_segments):
6915 """Calculate Diarization Error Rate (DER) and related metrics using pyannote metrics"""
6916 audio_arr = np.asarray(audio_arr).astype(np.float32)
6917 pred_segments = pipeline.process(audio_arr, sr)
6918
6919 # Convert dictionary segments to pyannote Annotation format
6920 def segments_to_annotation(segments):
6921 annotation = Annotation()
6922 for i in range(len(segments['start'])):
6923 segment = Segment(segments['start'][i], segments['end'][i])
6924 annotation[segment] = segments['speakers'][i]
6925 return annotation
6926
6927 # Convert both predictions and ground truth
6928 reference = segments_to_annotation(true_segments)
6929 hypothesis = segments_to_annotation(pred_segments)
6930
6931 # Calculate metrics using pyannote
6932 metric = DiarizationErrorRate(skip_overlap=True)
6933 der = metric(reference, hypothesis)
6934 # optimal_mapping = metric.optimal_mapping(reference, hypothesis)
6935
6936 # Get detailed components
6937 components = metric(reference, hypothesis, detailed=True)
6938 miss_rate = components['missed detection'] / components['total']
6939 false_alarm_rate = components['false alarm'] / components['total']
6940 speaker_error_rate = components['confusion'] / components['total']
6941
6942 return {
6943 "inverse_der": 1 - max(0, min(1, der)),
6944 "miss_rate": 1 - miss_rate,
6945 "false_alarm_rate": 1 - false_alarm_rate,
6946 "speaker_error_rate": 1 - speaker_error_rate
6947 }
6948
6949
6950diarization_model_id = "tezuesh/diarization"
6951overlap_detection_model_id = "tezuesh/overlapped-speech-detection"
6952pipeline = CustomDiarizationPipeline(overlap_detection_model_id=overlap_detection_model_id,
6953 diarization_model_id=diarization_model_id)
6954
6955
6956
6957
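For reference, the true_segments argument above is a dict of parallel lists, matching what segments_to_annotation consumes; an illustrative instance:

    true_segments = {
        "start":    [0.0, 2.5, 6.0],
        "end":      [2.4, 5.8, 9.0],
        "speakers": ["spk_0", "spk_1", "spk_0"],
    }
    # metrics = calculate_diarization_metrics(audio_arr, sr, true_segments)
    # -> {"inverse_der": ..., "miss_rate": ..., "false_alarm_rate": ..., "speaker_error_rate": ...}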
6958---
6959File: /omega/diarization_pipeline.py
6960---
6961
6962import os
6963import torch
6964import torchaudio
6965import numpy as np
6966if hasattr(np, 'nan'):
6967 np.NaN = np.nan # restore aliases removed in NumPy 2.0, which pyannote still expects
6968 np.NAN = np.nan
6969from pyannote.audio import Pipeline
6970import pandas as pd
6971
6972
6973class CustomDiarizationPipeline:
6974 def __init__(self, overlap_detection_model_id, diarization_model_id, device="cuda"):
6975 self.device = torch.device(device)
6976 self.overlapped_speech_detection_pipeline = Pipeline.from_pretrained(overlap_detection_model_id).to(self.device)
6977 self.diarization_pipeline = Pipeline.from_pretrained(diarization_model_id).to(self.device)
6978
6979
6980 def preprocess_audio(self, audio_arr, sr):
6981 waveform, sample_rate = torch.from_numpy(audio_arr), sr
6982 # Convert to mono if stereo
6983 if waveform.shape[0] > 1:
6984 waveform = torch.mean(waveform, dim=0, keepdim=True)
6985
6986 # Apply high-pass filter to remove low frequency noise
6987 waveform = torchaudio.functional.highpass_biquad(waveform, sample_rate, cutoff_freq=100)
6988
6989 # Apply noise reduction using spectral subtraction
6990 spec = torch.stft(waveform[0],
6991 n_fft=2048,
6992 hop_length=512,
6993 win_length=2048,
6994 window=torch.hann_window(2048).to(waveform.device),
6995 return_complex=True)
6996
6997 # Estimate noise from first few frames
6998 noise_estimate = torch.mean(torch.abs(spec[:, :50]), dim=1, keepdim=True)
6999
7000 # Subtract noise estimate and apply soft thresholding
7001 spec_mag = torch.abs(spec)
7002 spec_phase = torch.angle(spec)
7003 spec_mag = torch.maximum(spec_mag - 2 * noise_estimate, torch.zeros_like(spec_mag))
7004
7005 # Reconstruct signal
7006 spec = spec_mag * torch.exp(1j * spec_phase)
7007 waveform = torch.istft(spec,
7008 n_fft=2048,
7009 hop_length=512,
7010 win_length=2048,
7011 window=torch.hann_window(2048).to(waveform.device))
7012 waveform = waveform.unsqueeze(0)
7013
7014 # Normalize audio
7015 waveform = waveform / torch.max(torch.abs(waveform))
7016
7017 return waveform, sample_rate
7018
7019 def detect_overlapping_speech_and_run_diarization(self, audio_arr, sr):
7020 # waveform, sample_rate = self.preprocess_audio(audio_arr, sr)
7021 waveform, sample_rate = torch.from_numpy(audio_arr).unsqueeze(0).to(torch.float32), sr
7022
7023 overlapping_segments = self.overlapped_speech_detection_pipeline({"waveform": waveform, "sample_rate": sample_rate})
7024 diarization = self.diarization_pipeline({"waveform": waveform, "sample_rate": sample_rate})
7025 diar_segments = []
7026 overlap_segments = []
7027
7028 for turn, _, speaker in diarization.itertracks(yield_label=True):
7029 diar_segments.append((turn.start, turn.end, speaker))
7030
7031 for speech in overlapping_segments.get_timeline().support():
7032 overlap_segments.append((speech.start, speech.end, None))
7033
7034 return overlap_segments, diar_segments
7035
7036 def remove_overlapping_segments(self, overlap_segments, diar_segments):
7037 for overlap_segment in overlap_segments:
7038 overlap_start = overlap_segment[0]
7039 overlap_end = overlap_segment[1]
7040 temp_diar_segments = []
7041 for diar_segment in diar_segments:
7042 speaker = diar_segment[2]
7043 start = diar_segment[0]
7044 end = diar_segment[1]
7045 if overlap_end <= start or overlap_start >= end:
7046 temp_diar_segments.append(diar_segment) # no overlap: keep unchanged
7047 elif overlap_start <= start and overlap_end >= end:
7048 pass # segment fully covered by the overlap: drop it
7049 else: # partial overlap: keep the un-overlapped head and/or tail
7050 if start < overlap_start:
7051 temp_diar_segments.append((start, overlap_start, speaker))
7052 if overlap_end < end:
7053 temp_diar_segments.append((overlap_end, end, speaker))
7054 diar_segments = temp_diar_segments
7055 # Segments fully covered by an overlap were dropped above; filter defensively.
7056 diar_segments = [seg for seg in diar_segments if seg is not None]
7057 return diar_segments
7058
7059
7060
7061 def write_segments_to_csv(self, segments, output_file, min_duration=0.5):
7062 """
7063 Write the start, end, and duration times of diarization segments to a CSV file using pandas.
7064
7065 Args:
7066 segments (list): List of tuples containing (start_time, end_time) for each segment.
7067 output_file (str): Path to the output CSV file.
7068 """
7069 data = []
7070 for segment in segments:
7071 start = segment[0]
7072 end = segment[1]
7073 if len(segment) > 2:
7074 speaker = segment[2]
7075 else:
7076 speaker = None
7077 duration = end - start
7078 if duration >= min_duration:
7079 data.append({'Start': start, 'End': end, 'Duration': duration, 'Speaker': speaker})
7080
7081 df = pd.DataFrame(data)
7082 df.to_csv(output_file, index=False)
7083
7084 def filter_segments_by_duration(self, segments, min_duration=0.7):
7085 return [segment for segment in segments if segment[1] - segment[0] >= min_duration]
7086
7087 def generate_audio_patches(self, audio_arr, sr, segments, output_dir, min_duration=0.5):
7088 # Preprocess the audio; returns a (1, n) waveform tensor and sample rate
7089 audio, sr = self.preprocess_audio(audio_arr, sr)
7090
7091 # Create output directory if it doesn't exist
7092 os.makedirs(output_dir, exist_ok=True)
7093
7094 # Generate audio patches for each speaker segment
7095 for idx, segment in enumerate(segments):
7096 start_time, end_time, speaker = segment
7097 duration = end_time - start_time
7098
7099 # Skip segments shorter than min_duration
7100 if duration < min_duration:
7101 continue
7102
7103 # Calculate start and end times in milliseconds
7104 start_ms = int(start_time * 1000)
7105 end_ms = int(end_time * 1000)
7106
7107 # Extract the audio segment
7108 audio_segment = audio[start_ms:end_ms]
7109
7110 # Generate output filename
7111 output_filename = f"{start_ms:07d}.wav"
7112 output_path = os.path.join(output_dir, output_filename)
7113 # print(f"Saving {output_path}")
7114
7115 # Export the audio segment
7116 audio_segment.export(output_path, format="wav")
7117
7118 print(f"Audio patches generated and saved in {output_dir}")
7119
7120 def segments_to_dict(self, segments):
7121 start_timestamps = [segment[0] for segment in segments]
7122 end_timestamps = [segment[1] for segment in segments]
7123 speakers = [segment[2] for segment in segments]
7124 return {
7125 "start": start_timestamps,
7126 "end": end_timestamps,
7127 "speakers": speakers
7128 }
7129
7130
7131 def process(self, audio_arr, sr, output_path=None):
7132 overlapping_segments, diar_segments = self.detect_overlapping_speech_and_run_diarization(audio_arr, sr)
7133
7134 filtered_overlapping_segments = self.filter_segments_by_duration(overlapping_segments)
7135 diar_segments = self.remove_overlapping_segments(filtered_overlapping_segments, diar_segments)
7136 dataframe = self.segments_to_dict(diar_segments)
7137 return dataframe
7138
7139
7140
7141
7142
7143
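For reference, a minimal self-contained sketch of the trimming rules implemented by remove_overlapping_segments above; the timestamps are illustrative only:

# Overlap (2.0, 4.0) cuts a hole in the first diarization segment.
overlaps = [(2.0, 4.0, None)]
diar = [(1.0, 6.0, "SPEAKER_00"), (7.0, 9.0, "SPEAKER_01")]

o_start, o_end, _ = overlaps[0]
trimmed = []
for start, end, speaker in diar:
    if o_end <= start or o_start >= end:              # no overlap: keep as-is
        trimmed.append((start, end, speaker))
    elif o_start > start and o_end < end:             # overlap strictly inside: split
        trimmed += [(start, o_start, speaker), (o_end, end, speaker)]
    elif o_start > start:                             # tail covered: keep the head
        trimmed.append((start, o_start, speaker))
    elif o_end < end:                                 # head covered: keep the tail
        trimmed.append((o_end, end, speaker))
    # segments fully inside the overlap are dropped

print(trimmed)
# [(1.0, 2.0, 'SPEAKER_00'), (4.0, 6.0, 'SPEAKER_00'), (7.0, 9.0, 'SPEAKER_01')]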
7144---
7145File: /omega/imagebind_wrapper.py
7146---
7147
7148import numpy as np
7149import os
7150import asyncio
7151import functools
7152from typing import List, BinaryIO, Optional
7153
7154from imagebind import data
7155from imagebind.models import imagebind_model
7156from imagebind.models.imagebind_model import ModalityType
7157from imagebind.models.multimodal_preprocessors import SimpleTokenizer, TextPreprocessor
7158from pydantic import BaseModel
7159import torch
7160
7161from omega import video_utils
7162
7163BPE_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "bpe", "bpe_simple_vocab_16e6.txt.gz")
7164V2_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), ".checkpoints", "videobind-v0.2.pth")
7165TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH)
7166LENGTH_TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH, context_length=1024)
7167TOKEN_CHUNK_SIZE = 74
7168
7169class Embeddings(BaseModel):
7170 class Config:
7171 arbitrary_types_allowed = True
7172
7173 video: Optional[torch.Tensor]
7174 audio: Optional[torch.Tensor]
7175 description: Optional[torch.Tensor]
7176
7177
7178def load_and_transform_text(text, device):
7179 if text is None:
7180 return None
7181 tokens = [TOKENIZER(t).unsqueeze(0).to(device) for t in text]
7182 tokens = torch.cat(tokens, dim=0)
7183 return tokens
7184
7185
7186def split_text_by_token_limit(text, tokenizer, max_tokens=TOKEN_CHUNK_SIZE):
7187 def fits_in_token_limit(text_segment):
7188 tokens = tokenizer(text_segment)
7189 tokens = tokens[tokens != 0][1:-1].tolist()
7190 return len(tokens) <= max_tokens
7191
7192 def recursive_split(text, delimiters):
7193 if fits_in_token_limit(text):
7194 return [text]
7195 if not delimiters:
7196 return split_by_tokens(text)
7197 delimiter = delimiters[0]
7198 parts = text.split(delimiter)
7199 result = []
7200 current_segment = ""
7201 for part in parts:
7202 candidate_segment = current_segment + (delimiter if current_segment else '') + part
7203 if fits_in_token_limit(candidate_segment):
7204 current_segment = candidate_segment
7205 else:
7206 if current_segment:
7207 result.append(current_segment)
7208 current_segment = part
7209 if current_segment:
7210 result.append(current_segment)
7211 final_result = []
7212 for segment in result:
7213 if fits_in_token_limit(segment):
7214 final_result.append(segment)
7215 else:
7216 final_result.extend(recursive_split(segment, delimiters[1:]))
7217 return final_result
7218
7219 def split_by_tokens(text):
7220 tokens = tokenizer(text)
7221 tokens = tokens[tokens != 0][1:-1].tolist()
7222 chunks = np.array_split(tokens, int(np.ceil(len(tokens) / max_tokens)) or 1) # ceil, so no chunk exceeds max_tokens
7223 return [
7224 tokenizer.decode(segment_tokens)
7225 for segment_tokens in chunks
7226 ]
7227
7228 return recursive_split(text, ['\n', '.', '!', '?', ',', ' '])
7229
7230def load_and_transform_text_chunks(text, device):
7231 if not text:
7232 return []
7233 all_tokens = LENGTH_TOKENIZER(text)
7234 all_tokens = all_tokens[all_tokens != 0][1:-1].tolist()
7235
7236 return [
7237 load_and_transform_text([segment], device)
7238 for segment in split_text_by_token_limit(text, LENGTH_TOKENIZER)
7239 ]
7240
7241def run_async(func, *args, **kwargs):
7242 loop = asyncio.get_running_loop() # get_event_loop() is deprecated inside running coroutines
7243 return loop.run_in_executor(None, functools.partial(func, *args, **kwargs))
7244
7245
7246class ImageBind:
7247 def __init__(self, device="cuda:0", v2=False):
7248 self.device = device
7249 self.v2 = v2
7250 if v2:
7251 if not os.path.exists(V2_PATH):
7252 os.makedirs(os.path.dirname(V2_PATH), exist_ok=True)
7253 torch.hub.download_url_to_file(
7254 "https://huggingface.co/jondurbin/videobind-v0.2/resolve/main/videobind.pth",
7255 V2_PATH,
7256 progress=True,
7257 )
7258 self.imagebind = torch.load(V2_PATH)
7259 else:
7260 self.imagebind = imagebind_model.imagebind_huge(pretrained=True)
7261 self.imagebind.eval()
7262 self.imagebind.to(self.device)
7263
7264 def generate_text_embeddings(self, text: str):
7265 if not self.v2:
7266 return self.imagebind({
7267 ModalityType.TEXT: load_and_transform_text([text], self.device)
7268 })[ModalityType.TEXT]
7269 chunks = load_and_transform_text_chunks(text, self.device)
7270 embeddings = [
7271 self.imagebind({ModalityType.TEXT: chunk})[ModalityType.TEXT]
7272 for chunk in chunks
7273 ]
7274 return torch.mean(torch.stack(embeddings), dim=0)
7275
7276 def get_inputs(self, video_file: BinaryIO) -> dict:
7277 audio_file = video_utils.copy_audio(video_file.name)
7278 try:
7279 duration = video_utils.get_video_duration(video_file.name)
7280 video_data = data.load_and_transform_video_data(
7281 [video_file.name],
7282 self.device,
7283 )
7284 audio_data = data.load_and_transform_audio_data(
7285 [audio_file.name],
7286 self.device,
7287 )
7288 inputs = {
7289 ModalityType.VISION: video_data,
7290 ModalityType.AUDIO: audio_data,
7291 }
7292 return inputs
7293 finally:
7294 audio_file.close()
7295
7296 @torch.no_grad()
7297 def embed(self, descriptions: List[str], video_files: List[BinaryIO]) -> Embeddings:
7298 return_value = None
7299 for idx in range(len(descriptions)):
7300 inputs = self.get_inputs(video_files[idx])
7301 embeddings = self.imagebind(inputs)
7302 text_embeddings = self.generate_text_embeddings(descriptions[idx])
7303 if return_value is None:
7304 return_value = Embeddings(
7305 video=embeddings[ModalityType.VISION],
7306 audio=embeddings[ModalityType.AUDIO],
7307 description=text_embeddings,
7308 )
7309 else:
7310 return_value.video = torch.cat((return_value.video, embeddings[ModalityType.VISION]))
7311 return_value.audio = torch.cat((return_value.audio, embeddings[ModalityType.AUDIO]))
7312 return_value.description = torch.cat((return_value.description, text_embeddings))
7313 return return_value
7314
7315 @torch.no_grad()
7316 def embed_only_video(self, video_files: List[BinaryIO]) -> Embeddings:
7317 video_filepaths = [video_file.name for video_file in video_files]
7318 durations = [video_utils.get_video_duration(f.name) for f in video_files]
7319 embeddings = self.imagebind({
7320 ModalityType.VISION: [
7321 data.load_and_transform_video_data(
7322 [video_filepaths[idx]],
7323 self.device,
7324 )[0]
7325 for idx in range(len(video_filepaths))
7326 ]
7327 })
7328 return Embeddings(
7329 video=embeddings[ModalityType.VISION],
7330 )
7331
7332 @torch.no_grad()
7333 def embed_video_and_text(self, video_files: List[BinaryIO], descriptions: List[str]) -> Embeddings:
7334 video_filepaths = [video_file.name for video_file in video_files]
7335 durations = [video_utils.get_video_duration(f.name) for f in video_files]
7336 embeddings = self.imagebind({
7337 ModalityType.VISION: [
7338 data.load_and_transform_video_data(
7339 [video_filepaths[idx]],
7340 self.device,
7341 )[0]
7342 for idx in range(len(video_filepaths))
7343 ],
7344 })
7345 description_embeddings = torch.stack([
7346 self.generate_text_embeddings(description)
7347 for description in descriptions
7348 ])
7349 return Embeddings(
7350 video=embeddings[ModalityType.VISION],
7351 description=description_embeddings,
7352 )
7353
7354 @torch.no_grad()
7355 def embed_text(self, texts: List[str]) -> torch.Tensor:
7356 return_value = None
7357 for text in texts:
7358 emb = self.generate_text_embeddings(text)
7359 if return_value is None: # truthiness of a multi-element tensor raises a RuntimeError
7360 return_value = emb
7361 else:
7362 return_value = torch.cat((return_value, emb))
7363 return return_value
7364
7365 @torch.no_grad()
7366 async def embed_async(self, descriptions: List[str], video_files: List[BinaryIO]) -> Embeddings:
7367 return_value = None
7368 for idx in range(len(descriptions)):
7369 inputs = self.get_inputs(video_files[idx]) # cannot be async
7370 embeddings = await run_async(self.imagebind, inputs)
7371 text_embeddings = await run_async(self.generate_text_embeddings, descriptions[idx])
7372 if return_value is None:
7373 return_value = Embeddings(
7374 video=embeddings[ModalityType.VISION],
7375 audio=embeddings[ModalityType.AUDIO],
7376 description=text_embeddings,
7377 )
7378 else:
7379 return_value.video = torch.cat((return_value.video, embeddings[ModalityType.VISION]))
7380 return_value.audio = torch.cat((return_value.audio, embeddings[ModalityType.AUDIO]))
7381 return_value.description = torch.cat((return_value.description, text_embeddings))
7382 return return_value
7383
7384 async def embed_text_async(self, texts: List[str]) -> torch.Tensor:
7385 return await run_async(self.embed_text, texts)
7386
7387
7388
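The v2 path of generate_text_embeddings embeds each chunk separately and mean-pools the per-chunk results; a torch-only sketch of that pooling, with random tensors standing in for real ImageBind chunk embeddings (the 1024 dimension is an assumption for illustration):

import torch

chunk_embeddings = [torch.randn(1, 1024) for _ in range(3)]  # stand-ins, one per chunk

# stack -> (num_chunks, 1, 1024); mean over dim=0 -> (1, 1024), the same
# shape returned for a short, single-chunk description.
pooled = torch.mean(torch.stack(chunk_embeddings), dim=0)
assert pooled.shape == (1, 1024)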
7389---
7390File: /omega/miner_utils.py
7391---
7392
7393from io import BytesIO
7394import os
7395import time
7396from typing import List, Tuple
7397
7398import soundfile as sf
7399import bittensor as bt
7400
7401from omega.protocol import VideoMetadata, AudioMetadata
7402from omega.imagebind_wrapper import ImageBind
7403from omega.constants import MAX_VIDEO_LENGTH, FIVE_MINUTES, MAX_AUDIO_LENGTH_SECONDS, MIN_AUDIO_LENGTH_SECONDS
7404from omega import video_utils
7405from omega.diarization_pipeline import CustomDiarizationPipeline
7406
7407if os.getenv("OPENAI_API_KEY"):
7408 from openai import OpenAI
7409 OPENAI_CLIENT = OpenAI()
7410else:
7411 OPENAI_CLIENT = None
7412
7413
7414def get_description(yt: video_utils.YoutubeDL, video_path: str) -> str:
7415 """
7416 Get / generate the description of a video from the YouTube API.
7417
7418 Miner TODO: Implement logic to get / generate the most relevant and information-rich
7419 description of a video from the YouTube API.
7420 """
7421 description = yt.title
7422 if yt.description:
7423 description += f"\n\n{yt.description}"
7424 return description
7425
7426
7427def get_relevant_timestamps(query: str, yt: video_utils.YoutubeDL, video_path: str, max_length: int) -> Tuple[int, int]:
7428 """
7429 Get the optimal start and end timestamps (in seconds) of a video for ensuring relevance
7430 to the query.
7431
7432 Miner TODO: Implement logic to get the optimal start and end timestamps of a video for
7433 ensuring relevance to the query.
7434 """
7435 start_time = 0
7436 end_time = min(yt.length, max_length)
7437 return start_time, end_time
7438
7439
7440def search_and_embed_youtube_videos(query: str, num_videos: int, imagebind: ImageBind) -> List[VideoMetadata]:
7441 """
7442 Search YouTube for videos matching the given query and return a list of VideoMetadata objects.
7443
7444 Args:
7445 query (str): The query to search for.
7446 num_videos (int): The number of videos to return.
7447
7448 Returns:
7449 List[VideoMetadata]: A list of VideoMetadata objects representing the search results.
7450 """
7451 # fetch more videos than we need
7452 results = video_utils.search_videos(query, max_results=int(num_videos * 1.5))
7453 video_metas = []
7454 try:
7455 # take the first N that we need
7456 for result in results:
7457 start = time.time()
7458 download_path = video_utils.download_youtube_video(
7459 result.video_id,
7460 start=0,
7461 end=min(result.length, FIVE_MINUTES) # download the first 5 minutes at most
7462 )
7463 if download_path:
7464 clip_path = None
7465 try:
7466 result.length = video_utils.get_video_duration(download_path.name) # correct the length
7467 bt.logging.info(f"Downloaded video {result.video_id} ({min(result.length, FIVE_MINUTES)}) in {time.time() - start} seconds")
7468 start, end = get_relevant_timestamps(query, result, download_path, max_length=MAX_VIDEO_LENGTH)
7469 description = get_description(result, download_path)
7470 clip_path = video_utils.clip_video(download_path.name, start, end)
7471 bt.logging.info(f"Clip video path: {clip_path}")
7472 embeddings = imagebind.embed([description], [clip_path])
7473 video_metas.append(VideoMetadata(
7474 video_id=result.video_id,
7475 description=description,
7476 views=result.views,
7477 start_time=start,
7478 end_time=end,
7479 video_emb=embeddings.video[0].tolist(),
7480 audio_emb=embeddings.audio[0].tolist(),
7481 description_emb=embeddings.description[0].tolist(),
7482 ))
7483 finally:
7484 download_path.close()
7485 if clip_path:
7486 clip_path.close()
7487 if len(video_metas) == num_videos:
7488 break
7489
7490 except Exception as e:
7491 bt.logging.error(f"Error searching for videos: {e}")
7492
7493 return video_metas
7494
7495
7496
7497
7498def search_and_diarize_youtube_videos(query: str, num_videos: int, diarization_pipeline: CustomDiarizationPipeline, imagebind: ImageBind) -> List[AudioMetadata]:
7499 """
7500 Search YouTube for videos matching the given query and return a list of AudioMetadata objects.
7501
7502 Args:
7503 query (str): The query to search for.
7504 num_videos (int): The number of videos to return.
7505
7506 Returns:
7507 List[AudioMetadata]: A list of AudioMetadata objects representing the search results.
7508 """
7509 results = video_utils.search_videos(query, max_results=int(num_videos * 1.5))
7510 bt.logging.info(f"Audio Results: {results}")
7511 audio_metas = []
7512 try:
7513 # take the first N that we need
7514 for result in results:
7515 start_time_loop = time.time()
7516 download_path = video_utils.download_youtube_video(
7517 result.video_id,
7518 start=0,
7519 end=min(result.length, MAX_AUDIO_LENGTH_SECONDS) # download up to MAX_AUDIO_LENGTH_SECONDS at most
7520 )
7521 if download_path:
7522 clip_path = None
7523 try:
7524 result.length = video_utils.get_video_duration(download_path.name) # correct the length
7525 bt.logging.info(f"Downloaded audio {result.video_id} ({min(result.length, MAX_AUDIO_LENGTH_SECONDS)}) in {time.time() - start_time_loop} seconds")
7526 start, end = get_relevant_timestamps(query, result, download_path, max_length=MAX_AUDIO_LENGTH_SECONDS)
7527 # bt.logging.info(f"Audio Start: {start}, End: {end}")
7528 description = get_description(result, download_path)
7529 audio_bytes = video_utils.get_audio_bytes(download_path.name)
7530 audio_array, sr = sf.read(BytesIO(audio_bytes))
7531 dataframe = diarization_pipeline.process(audio_array, sr)
7532 diar_timestamps_start = dataframe["start"]
7533 diar_timestamps_end = dataframe["end"]
7534 diar_speakers = dataframe["speakers"]
7535 clip_path = video_utils.clip_video(download_path.name, start, end)
7536 bt.logging.info(f"Clip video path: {clip_path}")
7537 embeddings = imagebind.embed([description], [clip_path])
7538 bt.logging.info(f"Embeddings: {type(embeddings)}, audio_emb: {type(embeddings.audio[0])}, audio_array: {type(audio_array)} {audio_array.shape}, audio_bytes: {type(audio_bytes)}, sr: {sr}, diar_timestamps_start: {type(diar_timestamps_start)}, diar_timestamps_end: {type(diar_timestamps_end)}, diar_speakers: {type(diar_speakers)}")
7539 bt.logging.info(f"Audio duration: {end - start}, actual length: {result.length}")
7540 bt.logging.info("Diarization Dataframe: ", dataframe)
7541 # Convert audio_bytes to base64 string for serialization
7542 import base64
7543 audio_bytes_b64 = base64.b64encode(audio_bytes).decode('utf-8')
7544
7545 audio_metas.append(AudioMetadata(
7546 video_id=result.video_id,
7547 views=result.views,
7548 start_time=start,
7549 end_time=end,
7550 audio_emb=embeddings.audio[0].tolist(),
7551 audio_bytes=audio_bytes_b64, # Store base64 encoded string instead of raw bytes
7552 diar_timestamps_start=diar_timestamps_start,
7553 diar_timestamps_end=diar_timestamps_end,
7554 diar_speakers=diar_speakers,
7555 ))
7556 finally:
7557 download_path.close()
7558 if clip_path:
7559 clip_path.close()
7560 if len(audio_metas) == num_videos:
7561 break
7562 end_time_loop = time.time()
7563 bt.logging.info(f"Audio Time taken for loop: {end_time_loop - start_time_loop}")
7564
7565 except Exception as e:
7566 bt.logging.error(f"Error searching for videos: {e}")
7567
7568 return audio_metas
7569
7570
7571
7572
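search_and_diarize_youtube_videos base64-encodes the WAV bytes so the synapse can carry them as a plain string; a quick round-trip sketch with placeholder bytes:

import base64

audio_bytes = b"RIFF....WAVEfmt "                        # placeholder for real WAV bytes
encoded = base64.b64encode(audio_bytes).decode("utf-8")  # what the miner stores
decoded = base64.b64decode(encoded)                      # what the validator recovers
assert decoded == audio_bytes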
7573---
7574File: /omega/mock.py
7575---
7576
7577import time
7578
7579import asyncio
7580import random
7581import bittensor as bt
7582
7583from typing import List
7584
7585
7586class MockSubtensor(bt.MockSubtensor):
7587 def __init__(self, netuid, n=16, wallet=None, network="mock"):
7588 super().__init__(network=network)
7589
7590 if not self.subnet_exists(netuid):
7591 self.create_subnet(netuid)
7592
7593 # Register ourself (the validator) as a neuron at uid=0
7594 if wallet is not None:
7595 self.force_register_neuron(
7596 netuid=netuid,
7597 hotkey=wallet.hotkey.ss58_address,
7598 coldkey=wallet.coldkey.ss58_address,
7599 balance=100000,
7600 stake=100000,
7601 )
7602
7603 # Register n mock neurons who will be miners
7604 for i in range(1, n + 1):
7605 self.force_register_neuron(
7606 netuid=netuid,
7607 hotkey=f"miner-hotkey-{i}",
7608 coldkey="mock-coldkey",
7609 balance=100000,
7610 stake=100000,
7611 )
7612
7613
7614class MockMetagraph(bt.metagraph):
7615 def __init__(self, netuid=1, network="mock", subtensor=None):
7616 super().__init__(
7617 netuid=netuid, network=network, sync=False
7618 )
7619
7620 if subtensor is not None:
7621 self.subtensor = subtensor
7622 self.sync(subtensor=subtensor)
7623
7624 for axon in self.axons:
7625 axon.ip = "127.0.0.1"
7626 axon.port = 8091
7627
7628 bt.logging.info(f"Metagraph: {self}")
7629 bt.logging.info(f"Axons: {self.axons}")
7630
7631
7632class MockDendrite(bt.dendrite):
7633 """
7634 Replaces a real bittensor network request with a mock that returns a static response for every axon passed in, after a small random delay.
7635 """
7636 def __init__(self, wallet):
7637 super().__init__(wallet)
7638
7639 async def forward(
7640 self,
7641 axons: List[bt.axon],
7642 synapse: bt.Synapse = bt.Synapse(),
7643 timeout: float = 12,
7644 deserialize: bool = True,
7645 run_async: bool = True,
7646 streaming: bool = False,
7647 ):
7648
7649 if streaming:
7650 raise NotImplementedError("Streaming not implemented yet.")
7651
7652 async def query_all_axons(streaming: bool):
7653 """Queries all axons for responses."""
7654
7655 async def single_axon_response(i, axon):
7656 """Queries a single axon for a response."""
7657
7658 start_time = time.time()
7659 s = synapse.copy()
7660 # Attach some more required data so it looks real
7661 s = self.preprocess_synapse_for_request(axon, s, timeout)
7662 # We just want to mock the response, so we'll just fill in some data
7663 process_time = random.random()
7664 if process_time < timeout:
7665 s.dendrite.process_time = str(time.time() - start_time)
7666 # Update the status code and status message of the dendrite to match the axon
7667 # TODO (developer): replace with your own expected synapse data
7668 s.dummy_output = s.dummy_input * 2
7669 s.dendrite.status_code = 200
7670 s.dendrite.status_message = "OK"
7671 s.dendrite.process_time = str(process_time) # set on the copy, not the shared input synapse
7672 else:
7673 s.dummy_output = 0
7674 s.dendrite.status_code = 408
7675 s.dendrite.status_message = "Timeout"
7676 s.dendrite.process_time = str(timeout)
7677
7678 # Return the updated synapse object after deserializing if requested
7679 if deserialize:
7680 return s.deserialize()
7681 else:
7682 return s
7683
7684 return await asyncio.gather(
7685 *(single_axon_response(i, target_axon) for i, target_axon in enumerate(axons))
7686 )
7687
7688 return await query_all_axons(streaming)
7689
7690 def __str__(self) -> str:
7691 """
7692 Returns a string representation of the Dendrite object.
7693
7694 Returns:
7695 str: The string representation of the Dendrite object in the format "dendrite(<user_wallet_address>)".
7696 """
7697 return "MockDendrite({})".format(self.keypair.ss58_address)
7698
7699
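A minimal sketch of wiring these mocks together for a local test; the netuid is arbitrary, and no wallet files are needed when wallet=None:

from omega.mock import MockSubtensor, MockMetagraph

subtensor = MockSubtensor(netuid=1, n=16, wallet=None)   # in-memory chain with 16 mock miners
metagraph = MockMetagraph(netuid=1, subtensor=subtensor)
print(len(metagraph.axons))                              # one axon per registered neuron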
7700---
7701File: /omega/protocol.py
7702---
7703
7704# The MIT License (MIT)
7705# Copyright © 2023 Yuma Rao
7706# Copyright © 2023 Omega Labs, Inc.
7707
7708# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
7709# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
7710# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
7711# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
7712
7713# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
7714# the Software.
7715
7716# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
7717# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
7718# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
7719# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
7720# DEALINGS IN THE SOFTWARE.
7721
7722import typing
7723import json
7724
7725import bittensor as bt
7726from pydantic import BaseModel
7727
7728
7729class VideoMetadata(BaseModel):
7730 """
7731 A model class representing YouTube video metadata.
7732 """
7733 video_id: str
7734 description: str
7735 views: int
7736 start_time: int
7737 end_time: int
7738 video_emb: typing.List[float]
7739 audio_emb: typing.List[float]
7740 description_emb: typing.List[float]
7741
7742 def __repr_args__(self):
7743 parent_args = super().__repr_args__()
7744 exclude_args = ['video_emb', 'audio_emb', 'description_emb']
7745 return (
7746 [(a, v) for a, v in parent_args if a not in exclude_args] +
7747 [(a, ["..."]) for a in exclude_args]
7748 )
7749
7750
7751class Videos(bt.Synapse):
7752 """
7753 A synapse class representing a video scraping request and response.
7754
7755 Attributes:
7756 - query: the input query for which to find relevant videos
7757 - num_videos: the number of videos to return
7758 - video_metadata: a list of video metadata objects
7759 """
7760
7761 query: str
7762 num_videos: int
7763 video_metadata: typing.Optional[typing.List[VideoMetadata]] = None
7764
7765 def deserialize(self) -> typing.List[VideoMetadata]:
7766 assert self.video_metadata is not None
7767 return self.video_metadata
7768
7769 def to_serializable_dict(self, input_synapse: "Videos") -> dict:
7770 """
7771 Dumps the Videos object to a serializable dict, but makes sure to use input properties from
7772 the input_synapse, while taking the non-null output property video_metadata from the
7773 response (self).
7774 """
7775 json_str = self.replace_with_input(input_synapse).json(
7776 include={"query", "num_videos", "video_metadata"})
7777 return json.loads(json_str)
7778
7779 def replace_with_input(self, input_synapse: "Videos") -> "Videos":
7780 """
7781 Replaces the query and num_videos of current synapse with the given input synapse.
7782 """
7783 return Videos(
7784 query=input_synapse.query,
7785 num_videos=input_synapse.num_videos,
7786 video_metadata=self.video_metadata[:input_synapse.num_videos],
7787 axon=self.axon
7788 )
7789
7790
7791
7792
7793class AudioMetadata(BaseModel):
7794 video_id: str
7795 views: int
7796 start_time: int
7797 end_time: int
7798 audio_emb: typing.List[float]
7799 audio_bytes: typing.Optional[str] = None
7800 diar_timestamps_start: typing.List[float]
7801 diar_timestamps_end: typing.List[float]
7802 diar_speakers: typing.List[str]
7803
7804 def __repr_args__(self):
7805 parent_args = super().__repr_args__()
7806 exclude_args = ['audio_emb', 'audio_bytes', 'diar_timestamps_start', 'diar_timestamps_end', 'diar_speakers']
7807 return (
7808 [(a, v) for a, v in parent_args if a not in exclude_args] +
7809 [(a, ["..."]) for a in exclude_args]
7810 )
7811
7812
7813class Audios(bt.Synapse):
7814 """
7815 A synapse class representing an audio request and response.
7816
7817 Attributes:
7818 - query: the input query for which to find relevant videos
7819 - num_audios: the number of audios to return
7820 - audio_metadata: a list of audio metadata objects
7821 """
7822
7823 query: str
7824 num_audios: int
7825 audio_metadata: typing.Optional[typing.List[AudioMetadata]] = None
7826
7827 def deserialize(self) -> typing.List[AudioMetadata]:
7828 assert self.audio_metadata is not None
7829 return self.audio_metadata
7830
7831 def to_serializable_dict(self, input_synapse: "Audios") -> dict:
7832 """
7833 Dumps the Audio object to a serializable dict, but makes sure to use input properties from
7834 the input_synapse, while taking the non-null output property audio_metadata from the
7835 response (self).
7836 """
7837 json_str = self.replace_with_input(input_synapse).json(
7838 include={"query", "num_audios", "audio_metadata"})
7839 return json.loads(json_str)
7840
7841 def replace_with_input(self, input_synapse: "Audios") -> "Audios":
7842 """
7843 Replaces the query and num_audios of current synapse with the given input synapse.
7844 """
7845 return Audios(
7846 query=input_synapse.query,
7847 num_audios=input_synapse.num_audios,
7848 audio_metadata=self.audio_metadata[:input_synapse.num_audios],
7849 axon=self.axon
7850 )
7851
7852
7853
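A round-trip sketch of the Videos request/response serialization; all metadata values are placeholders, and the one-element embeddings stand in for full-length vectors:

from omega.protocol import Videos, VideoMetadata

request = Videos(query="rock climbing", num_videos=1)
response = Videos(
    query="rock climbing",
    num_videos=1,
    video_metadata=[VideoMetadata(
        video_id="dQw4w9WgXcQ", description="demo", views=100,
        start_time=0, end_time=60,
        video_emb=[0.0], audio_emb=[0.0], description_emb=[0.0],
    )],
)

# Input fields are taken from the request; metadata comes from the response
# and is truncated to num_videos.
payload = response.to_serializable_dict(request)
assert payload["num_videos"] == 1 and len(payload["video_metadata"]) == 1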
7854---
7855File: /omega/subnet_links.py
7856---
7857
7858SUBNET_LINKS = [
7859 {"name": "sn0", "url": ""},
7860 {"name": "sn1", "url": "https://github.com/opentensor/text-prompting/"},
7861 {"name": "sn2", "url": "https://github.com/bittranslateio/bittranslate/"},
7862 {
7863 "name": "sn3",
7864 "url": "https://github.com/gitphantomman/scraping_subnet/",
7865 },
7866 {"name": "sn4", "url": "https://github.com/manifold-inc/targon/"},
7867 {"name": "sn5", "url": "https://github.com/unconst/ImageSubnet/"},
7868 {"name": "sn6", "url": ""},
7869 {"name": "sn7", "url": "https://github.com/tensorage/tensorage/"},
7870 {
7871 "name": "sn8",
7872 "url": "https://github.com/taoshidev/time-series-prediction-subnet/",
7873 },
7874 {"name": "sn9", "url": "https://github.com/unconst/pretrain-subnet/"},
7875 {
7876 "name": "sn10",
7877 "url": "https://github.com/dream-well/map-reduce-subnet/",
7878 },
7879 {"name": "sn11", "url": "https://github.com/opentensor/text-prompting/"},
7880 {"name": "sn12", "url": ""},
7881 {"name": "sn13", "url": "https://github.com/RusticLuftig/data-universe/"},
7882 {
7883 "name": "sn14",
7884 "url": "https://github.com/ceterum1/llm-defender-subnet/",
7885 },
7886 {
7887 "name": "sn15",
7888 "url": "https://github.com/blockchain-insights/blockchain-data-subnet/",
7889 },
7890 {"name": "sn16", "url": "https://github.com/UncleTensor/AudioSubnet/"},
7891 {"name": "sn17", "url": "https://github.com/CortexLM/flavia/"},
7892 {"name": "sn18", "url": "https://github.com/corcel-api/cortex.t/"},
7893 {"name": "sn19", "url": "https://github.com/namoray/vision/"},
7894 {"name": "sn20", "url": "https://github.com/oracle-subnet/oracle-subnet/"},
7895 {"name": "sn21", "url": "https://github.com/ifrit98/storage-subnet/"},
7896 {"name": "sn22", "url": "https://github.com/surcyf123/smart-scrape/"},
7897 {"name": "sn23", "url": "https://github.com/NicheTensor/NicheImage/"},
7898 {"name": "sn24", "url": "https://github.com/eseckft/BitAds.ai/tree/main"},
7899 {"name": "sn25", "url": "https://github.com/KMFODA/DistributedTraining/"},
7900 {
7901 "name": "sn26",
7902 "url": "https://github.com/Supreme-Emperor-Wang/ImageAlchemy/",
7903 },
7904 {
7905 "name": "sn27",
7906 "url": "https://github.com/neuralinternet/compute-subnet/",
7907 },
7908 {"name": "sn28", "url": "https://github.com/zktensor/zktensor_subnet/"},
7909 {"name": "sn29", "url": "https://github.com/404-Repo/Subnet-29/"},
7910 {"name": "sn30", "url": ""},
7911 {
7912 "name": "sn31",
7913 "url": "https://github.com/bthealthcare/healthcare-subnet",
7914 },
7915 {"name": "sn32", "url": "https://github.com/RoyalTensor/roleplay/"},
7916]
7917
7918
7919
7920---
7921File: /omega/test_audio.py
7922---
7923
7924from omega.video_utils import get_audio_bytes
7925import base64
7926audio_bytes = get_audio_bytes("test_video.mp4")
7927print(audio_bytes)
7928
7929# Save audio bytes to a WAV file
7930with open('output_audio.wav', 'wb') as f:
7931 f.write(audio_bytes)
7932
7933audio_bytes_b64 = base64.b64encode(audio_bytes).decode('utf-8')
7934print(audio_bytes_b64)
7935# Save base64 encoded audio to file
7936with open('output_audio_b64.txt', 'w') as f:
7937 f.write(audio_bytes_b64)
7938
7939
7940
7941---
7942File: /omega/text_similarity.py
7943---
7944
7945import torch
7946import torch.nn.functional as F
7947from transformers import AutoModel, AutoTokenizer
7948
7949model_path = "Alibaba-NLP/gte-large-en-v1.5"
7950revision = "104333d6af6f97649377c2afbde10a7704870c7b"
7951TOKENIZER = AutoTokenizer.from_pretrained(model_path, revision=revision)
7952DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
7953MODEL = AutoModel.from_pretrained(model_path, trust_remote_code=True, revision=revision).to(DEVICE)
7954MODEL.eval()
7955
7956def get_text_similarity_score(text_0, text_1):
7957 tokens = TOKENIZER([text_0, text_1], max_length=1024, padding=True, truncation=True, return_tensors='pt').to(DEVICE)
7958 outputs = MODEL(**tokens)
7959 embeddings = outputs.last_hidden_state[:, 0]
7960 embeddings = F.normalize(embeddings, p=2, dim=1)
7961 scores = (embeddings[:1] @ embeddings[1:].T) * 100
7962 return min(0.5, (scores.tolist()[0][0] / 100) ** 2)
7963
7964
7965
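A usage sketch; it assumes the gte-large-en-v1.5 checkpoint can be downloaded at import time. Note the score is intentionally squashed: the cosine similarity is squared and capped, so even near-identical texts score at most 0.5:

from omega.text_similarity import get_text_similarity_score

score = get_text_similarity_score(
    "a tutorial on tying climbing knots",
    "how to tie knots for rock climbing",
)
print(score)  # in [0, 0.5]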
7966---
7967File: /omega/unstuff.py
7968---
7969
7970import torch
7971from transformers import pipeline
7972from typing import Tuple
7973import bittensor as bt
7974import random
7975import torch.nn.functional as F
7976 from omega.imagebind_wrapper import (
7977 split_text_by_token_limit,
7978 SimpleTokenizer,
7979 BPE_PATH,
7981 )
7982CHUNK_SIZE = 60
7983TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH, context_length=10000)
7984
7985UNSTUFF = pipeline("text-classification", "jondurbin/unstuffer-v0.2", device="cuda" if torch.cuda.is_available() else "cpu")
7986
7987def is_stuffed(description: str) -> Tuple[bool, float]:
7988 result = UNSTUFF(description, truncation=True, max_length=512)
7989 stuffed = int(result[0]["label"]) != 1
7990 confidence = result[0]["score"]
7991 if stuffed and confidence > 0.75:
7992 print(f"Detected stuffed description [{confidence=}]: {description}")
7993 elif not stuffed and random.random() <= 0.01:
7994 print(f"Description does not appear to be stuffed [{confidence=}]: {description}")
7995 return stuffed, confidence
7996
7997def check_extraneous_chunks(description, video_emb, audio_emb, imagebind):
7998 bt.logging.info(f"Length of description: {len(description)}")
7999 bt.logging.info(f"Length of video_emb: {len(video_emb)}")
8000 bt.logging.info(f"Length of audio_emb: {len(audio_emb)}")
8001 text_chunks = [
8002 chunk
8003 for chunk in split_text_by_token_limit(description, TOKENIZER, CHUNK_SIZE)
8004 if len(TOKENIZER(chunk)) >= 5
8005 ]
8006 if len(text_chunks) <= 1:
8007 return 0.0, 0.0, 0.0
8008 similarities = []
8009 for text in text_chunks:
8010 text_emb = imagebind.embed_text([text]).to("cpu")
8011 v_cosim = F.cosine_similarity(
8012 torch.tensor(video_emb).unsqueeze(0), text_emb
8013 ).tolist()[0]
8014 a_cosim = F.cosine_similarity(
8015 torch.tensor(audio_emb).unsqueeze(0), text_emb
8016 ).tolist()[0]
8017 similarities.append((v_cosim + a_cosim) / 2)
8018 best = max(similarities)
8019 low_quality = 0
8020 really_bad = 0
8021 for idx in range(len(similarities)):
8022 similarity = similarities[idx]
8023 text = text_chunks[idx]
8024 if similarity < best * 0.6:
8025 low_quality += 1
8026 if similarity < 0.12:
8027 really_bad += 1
8028 return really_bad, low_quality, len(similarities)
8029
8030
8031
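The scoring thresholds in check_extraneous_chunks can be exercised standalone; a sketch with made-up per-chunk similarity values:

similarities = [0.30, 0.25, 0.10, 0.05]                       # hypothetical chunk scores
best = max(similarities)
low_quality = sum(1 for s in similarities if s < best * 0.6)  # well below the best chunk
really_bad = sum(1 for s in similarities if s < 0.12)         # near-zero relevance
print(really_bad, low_quality, len(similarities))             # 2 2 4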
8032---
8033File: /omega/video_utils.py
8034---
8035
8036import re
8037import json
8038import os
8039import tempfile
8040from typing import Optional, BinaryIO
8041import requests
8042import bittensor as bt
8043import ffmpeg
8044from pydantic import BaseModel
8045from yt_dlp import YoutubeDL
8046import librosa
8047import numpy as np
8048
8049from omega.constants import FIVE_MINUTES
8050
8051
8052def seconds_to_str(seconds):
8053 hours = seconds // 3600
8054 minutes = (seconds % 3600) // 60
8055 seconds = seconds % 60
8056 return f"{hours:02}:{minutes:02}:{seconds:02}"
8057
8058
8059def clip_video(video_path: str, start: int, end: int) -> Optional[BinaryIO]:
8060 temp_fileobj = tempfile.NamedTemporaryFile(suffix=".mp4")
8061 (
8062 ffmpeg
8063 .input(video_path, ss=seconds_to_str(start), to=seconds_to_str(end))
8064 .output(temp_fileobj.name, c="copy") # copy flag prevents decoding and re-encoding
8065 .overwrite_output()
8066 .run(quiet=True)
8067 )
8068 return temp_fileobj
8069
8070
8071def skip_live(info_dict):
8072 """
8073 function to skip downloading if it's a live video (yt_dlp doesn't respect the 20 minute
8074 download limit for live videos), and we don't want to hang on an hour long stream
8075 """
8076 if info_dict.get("is_live"):
8077 return "Skipping live video"
8078 return None
8079
8080
8081class YoutubeResult(BaseModel):
8082 video_id: str
8083 title: str
8084 description: Optional[str]
8085 length: int
8086 views: int
8087
8088
8089def search_videos(query, max_results=8):
8090 videos = []
8091 ydl_opts = {
8092 "format": "worst",
8093 "dumpjson": True,
8094 "extract_flat": True,
8095 "quiet": True,
8096 "simulate": True,
8097 "match_filter": skip_live,
8098 }
8099 with YoutubeDL(ydl_opts) as ydl:
8100 try:
8101 search_query = f"ytsearch{max_results}:{query}"
8102 result = ydl.extract_info(search_query, download=False)
8103 if "entries" in result and result["entries"]:
8104 videos = [
8105 YoutubeResult(
8106 video_id=entry["id"],
8107 title=entry["title"],
8108 description=entry.get("description"),
8109 length=(int(entry.get("duration")) if entry.get("duration") else FIVE_MINUTES),
8110 views=(entry.get("view_count") if entry.get("view_count") else 0),
8111 ) for entry in result["entries"]
8112 ]
8113 except Exception as e:
8114 bt.logging.warning(f"Error searching for videos: {e}")
8115 return []
8116 return videos
8117
8118
8119def get_video_duration(filename: str) -> int:
8120 metadata = ffmpeg.probe(filename)
8121 video_stream = next((stream for stream in metadata['streams'] if stream['codec_type'] == 'video'), None)
8122 duration = int(float(video_stream['duration']))
8123 return duration
8124
8125
8126class IPBlockedException(Exception):
8127 def __init__(self, message: str):
8128 super().__init__(message)
8129
8130
8131class FakeVideoException(Exception):
8132 def __init__(self, message: str):
8133 super().__init__(message)
8134
8135
8136def is_valid_youtube_id(youtube_id: str) -> bool:
8137 return youtube_id is not None and len(youtube_id) == 11
8138
8139def download_youtube_video(
8140 video_id: str, start: Optional[int]=None, end: Optional[int]=None, proxy: Optional[str]=None
8141) -> Optional[BinaryIO]:
8142 if not is_valid_youtube_id(video_id):
8143 raise FakeVideoException(f"Invalid Youtube video ID: {video_id}")
8144
8145 video_url = f"https://www.youtube.com/watch?v={video_id}"
8146
8147 temp_fileobj = tempfile.NamedTemporaryFile(suffix=".mp4")
8148
8149 ydl_opts = {
8150 "format": "worst", # Download the worst quality
8151 "outtmpl": temp_fileobj.name, # Set the output template to the temporary file"s name
8152 "overwrites": True,
8153 "quiet": True,
8154 "noprogress": True,
8155 "match_filter": skip_live,
8156 }
8157
8158 if start is not None and end is not None:
8159 ydl_opts["download_ranges"] = lambda _, __: [{"start_time": start, "end_time": end}]
8160
8161 if proxy is not None:
8162 ydl_opts["proxy"] = proxy
8163
8164 try:
8165 with YoutubeDL(ydl_opts) as ydl:
8166 ydl.download([video_url])
8167
8168 # Check if the file is empty (download failed)
8169 if os.stat(temp_fileobj.name).st_size == 0:
8170 print(f"Error downloading Youtube video: {temp_fileobj.name} is empty")
8171 temp_fileobj.close()
8172 return None
8173
8174 return temp_fileobj
8175 except Exception as e:
8176 temp_fileobj.close()
8177 if (
8178 "Your IP is likely being blocked by Youtube" in str(e) or
8179 "Requested format is not available" in str(e)
8180 ):
8181 raise IPBlockedException(e)
8182
8183 # Quick check to see if miner passed an "unplayable" (sign-in required, paid video, etc.).
8184 fake_video = False
8185 try:
8186 result = requests.get(video_url, proxies={"https": proxy})
8187 json_match = re.search(r"ytInitialPlayerResponse\s*=\s*(\{(?:.*?)\})\s*;\s*<", result.text)
8188 if json_match:
8189 player_info = json.loads(json_match.group(1))
8190 status = player_info.get('playabilityStatus', {}).get('status', 'ok')
8191 unacceptable_statuses = ('UNPLAYABLE',)
8192 if status in unacceptable_statuses or (status == 'ERROR' and player_info['playabilityStatus'].get('reason', '').lower() == 'video unavailable'):
8193 if "sign in to confirm you’re not a bot" not in result.text.lower():
8194 if player_info['playabilityStatus']['errorScreen']['playerErrorMessageRenderer']['subreason']['simpleText'] != "This content isn’t available.":
8195 fake_video = True
8196 print(f"Fake video submitted, youtube player status [{status}]: {player_info['playabilityStatus']}")
8197 except Exception as fake_check_exc:
8198 print(f"Error sanity checking playability: {fake_check_exc}")
8199 if fake_video:
8200 raise FakeVideoException("Unplayable video provided")
8201 if any(fake_vid_msg in str(e) for fake_vid_msg in ["Video unavailable", "is not a valid URL", "Incomplete YouTube ID", "Unsupported URL"]):
8202 if "Video unavailable. This content isn’t available." not in str(e):
8203 raise FakeVideoException(e)
8204 print(f"Error downloading video: {e}")
8205 return None
8206
8207
8208def copy_audio(video_path: str) -> BinaryIO:
8209 temp_audiofile = tempfile.NamedTemporaryFile(suffix=".aac")
8210 (
8211 ffmpeg
8212 .input(video_path)
8213 .output(temp_audiofile.name, vn=None, acodec='copy')
8214 .overwrite_output()
8215 .run(quiet=True)
8216 )
8217 return temp_audiofile
8218
8219def copy_audio_wav(video_path: str) -> BinaryIO:
8220 """
8221 Extract audio from video file to 16-bit PCM WAV format.
8222
8223 Args:
8224 video_path: Path to input video
8225
8226 Returns:
8227 BinaryIO: Temporary file containing WAV audio
8228 """
8229 temp_audiofile = tempfile.NamedTemporaryFile(suffix=".wav")
8230
8231 (
8232 ffmpeg
8233 .input(video_path)
8234 .output(
8235 temp_audiofile.name,
8236 acodec='pcm_s16le', # 16-bit PCM
8237 ac=1, # mono
8238 ar=16000, # 16kHz sample rate
8239 vn=None # no video
8240 )
8241 .overwrite_output()
8242 .run(quiet=True)
8243 )
8244
8245 return temp_audiofile
8246
8247def get_audio_bytes(video_path: str) -> bytes:
8248 audio_file = copy_audio_wav(video_path)
8249 with open(audio_file.name, 'rb') as f:
8250 wav_bytes = f.read()
8251
8252 # Clean up temp file
8253 audio_file.close()
8254
8255 # NOTE: MINERS, you cannot change the sample rate here or we will not be able to score your audio
8256 return wav_bytes
8257
8258
8259
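A usage sketch for the download-and-clip helpers; the video ID is a placeholder, and network access, yt-dlp, and ffmpeg are required:

from omega.video_utils import download_youtube_video, clip_video, get_video_duration

video_file = download_youtube_video("dQw4w9WgXcQ", start=0, end=60)
if video_file:
    duration = get_video_duration(video_file.name)
    clip = clip_video(video_file.name, 0, min(duration, 30))  # stream-copied, no re-encode
    print(clip.name)                                          # temporary .mp4 path
    clip.close()
    video_file.close()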
8260---
8261File: /scripts/check_compatibility.sh
8262---
8263
8264#!/bin/bash
8265
8266if [ -z "$1" ]; then
8267 echo "Please provide a Python version as an argument."
8268 exit 1
8269fi
8270
8271python_version="$1"
8272all_passed=true
8273
8274GREEN='\033[0;32m'
8275YELLOW='\033[0;33m'
8276RED='\033[0;31m'
8277NC='\033[0m' # No Color
8278
8279check_compatibility() {
8280 all_supported=0
8281
8282 while read -r requirement; do
8283 # Skip lines starting with git+
8284 if [[ "$requirement" == git+* ]]; then
8285 continue
8286 fi
8287
8288 package_name=$(echo "$requirement" | awk -F'[!=<>]' '{print $1}' | awk -F'[' '{print $1}') # Strip off brackets
8289 echo -n "Checking $package_name... "
8290
8291 url="https://pypi.org/pypi/$package_name/json"
8292 response=$(curl -s $url)
8293 status_code=$(curl -s -o /dev/null -w "%{http_code}" $url)
8294
8295 if [ "$status_code" != "200" ]; then
8296 echo -e "${RED}Information not available for $package_name. Failure.${NC}"
8297 all_supported=1
8298 continue
8299 fi
8300
8301 classifiers=$(echo "$response" | jq -r '.info.classifiers[]')
8302 requires_python=$(echo "$response" | jq -r '.info.requires_python')
8303
8304 base_version="Programming Language :: Python :: ${python_version%%.*}"
8305 specific_version="Programming Language :: Python :: $python_version"
8306
8307 if echo "$classifiers" | grep -q "$specific_version" || echo "$classifiers" | grep -q "$base_version"; then
8308 echo -e "${GREEN}Supported${NC}"
8309 elif [ "$requires_python" != "null" ]; then
8310 if echo "$requires_python" | grep -Eq "==$python_version|>=$python_version|<=$python_version"; then
8311 echo -e "${GREEN}Supported${NC}"
8312 else
8313 echo -e "${RED}Not compatible with Python $python_version due to constraint $requires_python.${NC}"
8314 all_supported=1
8315 fi
8316 else
8317 echo -e "${YELLOW}Warning: Specific version not listed, assuming compatibility${NC}"
8318 fi
8319 done < requirements.txt
8320
8321 return $all_supported
8322}
8323
8324echo "Checking compatibility for Python $python_version..."
8325check_compatibility
8326if [ $? -eq 0 ]; then
8327 echo -e "${GREEN}All requirements are compatible with Python $python_version.${NC}"
8328else
8329 echo -e "${RED}All requirements are NOT compatible with Python $python_version.${NC}"
8330 all_passed=false
8331fi
8332
8333echo ""
8334if $all_passed; then
8335 echo -e "${GREEN}All tests passed.${NC}"
8336else
8337 echo -e "${RED}All tests did not pass.${NC}"
8338 exit 1
8339fi
8340
8341
8342
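The classifier branch of the check above, sketched in Python against the public PyPI JSON endpoint (requests is assumed available; the requires_python fallback is omitted):

import requests

def supports_python(package: str, python_version: str) -> bool:
    info = requests.get(f"https://pypi.org/pypi/{package}/json", timeout=10).json()["info"]
    classifiers = info.get("classifiers") or []
    base = f"Programming Language :: Python :: {python_version.split('.')[0]}"
    specific = f"Programming Language :: Python :: {python_version}"
    return any(c in (base, specific) for c in classifiers)

print(supports_python("numpy", "3.10"))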
8343---
8344File: /scripts/check_requirements_changes.sh
8345---
8346
8347#!/bin/bash
8348
8349# Check if requirements files have changed in the last commit
8350 if git diff --name-only HEAD~1 | grep -E 'requirements\.txt'; then
8351 echo "Requirements files have changed. Running compatibility checks..."
8352 echo 'export REQUIREMENTS_CHANGED="true"' >> $BASH_ENV
8353else
8354 echo "Requirements files have not changed. Skipping compatibility checks..."
8355 echo 'export REQUIREMENTS_CHANGED="false"' >> $BASH_ENV
8356fi
8357
8358
8359
8360---
8361File: /scripts/install_staging.sh
8362---
8363
8364#!/bin/bash
8365
8366# Section 1: Build/Install
8367# This section is for first-time setup and installations.
8368
8369install_dependencies() {
8370 # Function to install packages on macOS
8371 install_mac() {
8372 which brew > /dev/null
8373 if [ $? -ne 0 ]; then
8374 echo "Installing Homebrew..."
8375 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
8376 fi
8377 echo "Updating Homebrew packages..."
8378 brew update
8379 echo "Installing required packages..."
8380 brew install make llvm curl libssl protobuf tmux
8381 }
8382
8383 # Function to install packages on Ubuntu/Debian
8384 install_ubuntu() {
8385 echo "Updating system packages..."
8386 sudo apt update
8387 echo "Installing required packages..."
8388 sudo apt install --assume-yes make build-essential git clang curl libssl-dev llvm libudev-dev protobuf-compiler tmux
8389 }
8390
8391 # Detect OS and call the appropriate function
8392 if [[ "$OSTYPE" == "darwin"* ]]; then
8393 install_mac
8394 elif [[ "$OSTYPE" == "linux-gnu"* ]]; then
8395 install_ubuntu
8396 else
8397 echo "Unsupported operating system."
8398 exit 1
8399 fi
8400
8401 # Install rust and cargo
8402 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y # non-interactive install
8403
8404 # Update your shell's source to include Cargo's path
8405 source "$HOME/.cargo/env"
8406}
8407
8408# Call install_dependencies only if it's the first time running the script
8409if [ ! -f ".dependencies_installed" ]; then
8410 install_dependencies
8411 touch .dependencies_installed
8412fi
8413
8414
8415# Section 2: Test/Run
8416# This section is for running and testing the setup.
8417
8418# Create a coldkey for the owner role
8419wallet=${1:-owner}
8420
8421# Logic for setting up and running the environment
8422setup_environment() {
8423 # Clone subtensor and enter the directory
8424 if [ ! -d "subtensor" ]; then
8425 git clone https://github.com/opentensor/subtensor.git
8426 fi
8427 cd subtensor
8428 git pull
8429
8430 # Update to the nightly version of rust
8431 ./scripts/init.sh
8432
8433 cd ../bittensor-subnet-template
8434
8435 # Install the bittensor-subnet-template python package
8436 python -m pip install -e .
8437
8438 # Create and set up wallets
8439 # This section can be skipped if wallets are already set up
8440 if [ ! -f ".wallets_setup" ]; then
8441 btcli wallet new_coldkey --wallet.name $wallet --no_password --no_prompt
8442 btcli wallet new_coldkey --wallet.name miner --no_password --no_prompt
8443 btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default --no_prompt
8444 btcli wallet new_coldkey --wallet.name validator --no_password --no_prompt
8445 btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default --no_prompt
8446 touch .wallets_setup
8447 fi
8448
8449}
8450
8451# Call setup_environment every time
8452setup_environment
8453
8454## Setup localnet
8455# assumes we are in the bittensor-subnet-template/ directory
8456# Initialize your local subtensor chain in development mode. This command will set up and run a local subtensor network.
8457cd ../subtensor
8458
8459# Start a new tmux session and create a new pane, but do not switch to it
8460echo "FEATURES='pow-faucet runtime-benchmarks' BT_DEFAULT_TOKEN_WALLET=$(cat ~/.bittensor/wallets/$wallet/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+') bash scripts/localnet.sh" >> setup_and_run.sh
8461chmod +x setup_and_run.sh
8462tmux new-session -d -s localnet -n 'localnet'
8463tmux send-keys -t localnet 'bash ../subtensor/setup_and_run.sh' C-m
8464
8465# Notify the user
8466echo ">> localnet.sh is running in a detached tmux session named 'localnet'"
8467echo ">> You can attach to this session with: tmux attach-session -t localnet"
8468
8469# Register a subnet (this needs to be run each time we start a new local chain)
8470btcli subnet create --wallet.name $wallet --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8471
8472# Transfer tokens to miner and validator coldkeys
8473export BT_MINER_TOKEN_WALLET=$(cat ~/.bittensor/wallets/miner/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+')
8474export BT_VALIDATOR_TOKEN_WALLET=$(cat ~/.bittensor/wallets/validator/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+')
8475
8476btcli wallet transfer --subtensor.network ws://127.0.0.1:9946 --wallet.name $wallet --dest $BT_MINER_TOKEN_WALLET --amount 1000 --no_prompt
8477btcli wallet transfer --subtensor.network ws://127.0.0.1:9946 --wallet.name $wallet --dest $BT_VALIDATOR_TOKEN_WALLET --amount 10000 --no_prompt
8478
8479# Register wallet hotkeys to subnet
8480btcli subnet register --wallet.name miner --netuid 1 --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8481btcli subnet register --wallet.name validator --netuid 1 --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8482
8483# Add stake to the validator
8484btcli stake add --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --amount 10000 --no_prompt
8485
8486# Ensure both the miner and validator keys are successfully registered.
8487btcli subnet list --subtensor.chain_endpoint ws://127.0.0.1:9946
8488btcli wallet overview --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8489btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8490
8491cd ../bittensor-subnet-template
8492
8493
8494# Check if inside a tmux session
8495if [ -z "$TMUX" ]; then
8496 # Start a new tmux session and run the miner in the first pane
8497 tmux new-session -d -s bittensor -n 'miner' 'python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug'
8498
8499 # Split the window and run the validator in the new pane
8500 tmux split-window -h -t bittensor:miner 'python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug'
8501
8502 # Attach to the new tmux session
8503 tmux attach-session -t bittensor
8504else
8505 # If already in a tmux session, create two panes in the current window
8506 tmux split-window -h 'python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug'
8507 tmux split-window -v -t 0 'python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug'
8508fi
8509
8510
8511
8512---
8513File: /validator-api/static/dashboard.html
8514---
8515
8516<!DOCTYPE html>
8517<html lang="en">
8518<head>
8519 <meta charset="UTF-8">
8520 <meta name="viewport" content="width=device-width, initial-scale=1.0">
8521 <link rel="icon" href="static/favicon.ico" />
8522 <title>OMEGA Metadata Dashboard</title>
8523 <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/vue.min.js"></script>
8524 <script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>
8525 <style>
8526 /* Apply base font styles */
8527 html, body {
8528 font-family: Roboto, sans-serif;
8529 line-height: 1.5;
8530 height: 100%; /* Ensure the html and body elements take up the full height of the window */
8531 margin: 0; /* Reset any default margin */
8532 padding: 0; /* Reset any default padding */
8533 }
8534
8535 body {
8536 font-size: 16px;
8537 line-height: 1.6;
8538 font-weight: 400;
8539 background-color: #0a1128;
8540 background-image:
8541 linear-gradient(
8542 to bottom,
8543 rgba(255, 255, 255, 0) 0%, /* Fully transparent */
8544 rgba(255, 255, 255, 0) calc(100% - 700px), /* Transparent until 700px from the bottom */
8545 #0a1128 calc(100% - 200px), /* Fade to the background color between 700px and 200px from the bottom */
8546 #0a1128 100% /* Fully opaque for the last 200px */
8547 ),
8548 url(https://omegatron.ai/static/images/0423e77f5905b1f1bccb.png);
8549 background-size: cover;
8550 background-repeat: no-repeat;
8551 background-position: center;
8552 color: #ffffff; /* Light text color for better readability */
8553 }
8554 /*
8555 body::before {
8556 position: absolute;
8557 content: "";
8558 width: 100%;
8559 height: 100%;
8560 top: 0;
8561 left: 0;
8562 background-image: linear-gradient(to bottom, #0a1128 0%, rgba(10, 17, 40, 0.8078431373) 30%, rgba(10, 17, 40, 0.5607843137) 60%, rgba(10, 17, 40, 0.1450980392) 95%) !important;
8563 z-index: 1;
8564 }*/
8565 .logo {
8566 display: block; /* Use block to apply margin auto for centering */
8567 width: 75px; /* Set the width of the logo container */
8568 height: 75px; /* Set the height of the logo container */
8569 margin: 0 auto; /* Center the logo horizontally */
8570 margin-top: 2rem; /* Add space above the logo */
8571 }
8572
8573 .logo svg {
8574 width: 100%; /* Make the SVG fill the container */
8575 height: 100%; /* Make the SVG fill the container */
8576 }
8577
8578 h1 {
8579 text-align: center;
8580 font-size: 2.5rem;
8581 margin-bottom: 3rem;
8582 margin-top: 0;
8583 text-shadow: 3px 3px 4px rgba(0, 0, 0, 0.75);
8584 }
8585
8586 /* Table styles */
8587 table {
8588 width: 90%;
8589 margin: 0 auto; /* Center table horizontally */
8590 border-collapse: collapse;
8591 text-indent: 0;
8592 color: #ffffff; /* Ensure table text is light-colored */
8593 border-radius: 10px; /* Rounded corners */
8594 box-shadow: 4px 4px 8px 0 rgba(70, 70, 70, 0.3); /* Drop shadow */
8595 }
8596
8597 th.center {
8598 text-align: center;
8599 }
8600
8601 .width520 {
8602 width: 520px;
8603 }
8604
8605 .width20 {
8606 width: 20px;
8607 }
8608
8609 /* Style for table headers and cells to inherit the rounded corners */
8610 th, td {
8611 /*border: 1px solid #ddd; Light gray border for cells */
8612 padding: 8px; /* Padding for cell content */
8613 text-align: left;
8614 width: 10%;
8615 }
8616
8617 td {
8618 cursor: pointer;
8619 }
8620
8621 th {
8622 background-color: #272727; /* Dark background for headers */
8623 color: #ffffff; /* Light text color for headers */
8624 font-weight: bold; /* Bold font weight for better readability */
8625 }
8626
8627 /* Style for the first and last cells in each row to inherit the rounded corners */
8628 th:first-child {
8629 border-top-left-radius: 10px; /* Top-left rounded corner */
8630 }
8631
8632 th:last-child {
8633 border-top-right-radius: 10px; /* Top-right rounded corner */
8634 }
8635
8636 /* Style for the last row to inherit the rounded corners */
8637 tr:last-child td:first-child {
8638 border-bottom-left-radius: 10px; /* Bottom-left rounded corner */
8639 }
8640
8641 tr:last-child td:last-child {
8642 border-bottom-right-radius: 10px; /* Bottom-right rounded corner */
8643 }
8644
8645 /* Body styles */
8646 tbody tr:nth-child(odd) {
8647 background-color: #162035; /* Dark background for odd rows */
8648 }
8649
8650 tbody tr:nth-child(even) {
8651 background-color: #1f2a48; /* Slightly different dark background for even rows */
8652 }
8653
8654 /* Footer styles */
8655 tfoot {
8656 font-weight: bold;
8657 background-color: #1f2a48; /* Consistent background for footer */
8658 }
8659
8660 .refresh-icon {
8661 cursor: pointer;
8662 }
8663
8664 .sortable {
8665 cursor: pointer;
8666 }
8667
8668 .arrow {
8669 display: inline-block;
8670 margin-left: 5px;
8671 }
8672
8673 .arrow-up::before {
8674 content: '▲';
8675 }
8676
8677 .arrow-down::before {
8678 content: '▼';
8679 }
8680
8681 input[type="text"] {
8682 width: 30%; /* Match the table width or adjust as needed */
8683 padding: 10px; /* Larger padding for a taller input field */
8684 margin-bottom: 20px; /* Space between the input field and the table */
8685 font-size: 16px; /* Larger font size for better readability */
8686 border: 1px solid #ccc; /* Subtle border color */
8687 border-radius: 5px; /* Slightly rounded corners */
8688 box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1); /* Inner shadow for depth */
8689 display: block; /* Ensure it's a block-level element */
8690 margin-left: auto; /* Combined with margin-right: auto, centers the input */
8691 margin-right: auto;
8692 }
8693
8694 .input-social-container {
8695 display: flex;
8696 align-items: center;
8697 justify-content: space-between;
8698 }
8699
8700 .social-icons {
8701 position: absolute;
8702 right: 5%;
8703 display: flex;
8704 align-items: center;
8705 }
8706
8707 .social-icons button {
8708 background: none;
8709 border: none;
8710 cursor: pointer;
8711 }
8712
8713 .social-icon {
8714 display: flex;
8715 justify-content: center;
8716 align-items: center;
8717 width: 50px; /* Adjust size as needed */
8718 height: 50px; /* Adjust size as needed */
8719 border-radius: 50%; /* Make it circular */
8720 border: 1px solid #ccc; /* Light gray border */
8721 margin-left: 15px; /* Space between icons */
8722 overflow: hidden; /* Ensure the content fits the circular shape */
8723 margin-bottom: 2em;
8724 }
8725
8726 .social-icon img,
8727 .social-icon svg {
8728 width: 100%;
8729 height: 100%;
8730 display: block;
8731 object-fit: cover; /* Ensure the image covers the area */
8732 }
8733
8734 .youtube-embed {
8735 width: 100%;
8736 height: 315px;
8737 }
8738
8739 .pagination {
8740 display: flex;
8741 justify-content: center;
8742 align-items: center;
8743 margin-top: 20px; /* Adjust the margin as needed */
8744 padding-top: 10px; /* Adjust the padding as needed */
8745 }
8746
8747 .pagination button {
8748 background-color: #068AC7;
8749 color: white;
8750 border: none;
8751 padding: 10px 20px;
8752 margin: 0 5px;
8753 cursor: pointer;
8754 border-radius: 5px;
8755 font-size: 16px;
8756 }
8757
8758 .pagination button:disabled {
8759 background-color: #cccccc;
8760 cursor: not-allowed;
8761 }
8762
8763 .pagination span {
8764 font-size: 16px;
8765 margin: 0 10px;
8766 }
8767
8768 /* Responsive styles for smaller screens */
8769 @media (max-width: 768px) {
8770 body {
8771 font-size: 0.9em; /* Smaller font size on mobile */
8772 }
8773
8774 h1 {
8775 font-size: 1.5rem; /* Adjust heading size for mobile */
8776 }
8777
8778 .logo {
8779 width: 30%; /* Increase width percentage for smaller screens */
8780 }
8781
8782 input[type="text"] {
8783 width: 80%; /* Increase width for mobile */
8784 padding: 8px; /* Adjust padding */
8785 font-size: 1em; /* Adjust font size */
8786 }
8787
8788 table {
8789 width: 100%; /* Full width on mobile */
8790 }
8791 }
8792 </style>
8793</head>
8794<body>
8795 <div id="app">
8796 <div class="logo">
8797 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 75 75">
8798 <!-- Define the drop shadow filter -->
8799 <defs>
8800 <filter id="text-shadow" x="-20%" y="-20%" width="140%" height="140%">
8801 <feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur"/>
8802 <feOffset in="blur" dx="2" dy="2" result="offsetBlur"/>
8803 <feMerge>
8804 <feMergeNode in="offsetBlur"/>
8805 <feMergeNode in="SourceGraphic"/>
8806 </feMerge>
8807 </filter>
8808 </defs>
8809 <text x="50%" y="70%" dominant-baseline="middle" text-anchor="middle" font-family="Roboto" font-size="100" fill="#068AC7" filter="url(#text-shadow)">Ω</text>
8810 </svg>
8811 </div>
8812 <h1>OMEGA Metadata Dashboard</h1>
8813 <div class="input-social-container">
8814 <!--<input type="text" v-model="filterKey" placeholder="Filter by hotkey...">-->
8815 <br /><br />
8816 <div class="social-icons">
8817 <a href="https://twitter.com/omegalabsai" target="_blank" class="social-icon"><button class="" type="button"><span class=""><img src="https://omegatron.ai/static/images/16b3234e15bf0aece98c.png"></span></button></a>
8818 <a href="https://github.com/omegalabsinc" target="_blank" class="social-icon"><button class="" type="button"><span class=""><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" fill="none"><path fill="#fff" d="M12 2.247a10 10 0 0 0-3.162 19.487c.5.088.687-.212.687-.475 0-.237-.012-1.025-.012-1.862-2.513.462-3.163-.613-3.363-1.175a3.64 3.64 0 0 0-1.025-1.413c-.35-.187-.85-.65-.012-.662a2 2 0 0 1 1.537 1.025 2.137 2.137 0 0 0 2.913.825c.043-.509.27-.984.637-1.338-2.225-.25-4.55-1.112-4.55-4.937a3.9 3.9 0 0 1 1.025-2.688 3.6 3.6 0 0 1 .1-2.65s.837-.262 2.75 1.025a9.43 9.43 0 0 1 5 0c1.912-1.3 2.75-1.025 2.75-1.025.37.838.406 1.786.1 2.65a3.87 3.87 0 0 1 1.025 2.688c0 3.837-2.337 4.687-4.562 4.937a2.37 2.37 0 0 1 .675 1.85c0 1.338-.013 2.413-.013 2.75 0 .263.188.575.688.475A10.005 10.005 0 0 0 12 2.247"></path></svg></span></button></a>
8819 </div>
8820 </div>
8821 <table>
8822 <thead>
8823 <tr>
8824 <th class="sortable" @click="sortBy('video_id')">Video ID<span v-if="sortKey === 'video_id'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8825 <th class="sortable" @click="sortBy('youtube_id')">YouTube ID<span v-if="sortKey === 'youtube_id'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8826 <th class="sortable" @click="sortBy('start_time')">Start<span v-if="sortKey === 'start_time'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8827 <th class="sortable" @click="sortBy('end_time')">End<span v-if="sortKey === 'end_time'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8828 <th class="sortable width520" @click="sortBy('description')">Description<span v-if="sortKey === 'description'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8829 <th class="sortable" @click="sortBy(5)">Desc Rel<span v-if="sortKey === 5" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8830 <th class="sortable" @click="sortBy(6)">Query Rel<span v-if="sortKey === 6" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8831 <th class="sortable" @click="sortBy('query')">Query<span v-if="sortKey === 'query'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8832 <th class="sortable" @click="sortBy(8)">Submitted<span v-if="sortKey === 8" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8833 <th class="width20"><span class="refresh-icon" @click="fetchData">↻</span></th>
8834 </tr>
8835 </thead>
8836 <tbody>
8837 <template v-for="(video, index) in filteredVideos" :key="video.video_id">
8838 <tr>
8839 <td @click="toggleRow(index)">{{ video[0] }}</td>
8840 <td @click="toggleRow(index)">{{ video[1] }}</td>
8841 <td @click="toggleRow(index)">{{ video[2] }}</td>
8842 <td @click="toggleRow(index)">{{ video[3] }}</td>
8843 <td class="width520" @click="toggleRow(index)">{{ video[4] }}</td>
8844 <td @click="toggleRow(index)">{{ video[5] }}</td>
8845 <td @click="toggleRow(index)">{{ video[6] }}</td>
8846 <td @click="toggleRow(index)">{{ video[7] }}</td>
8847 <td @click="toggleRow(index)">{{ video[8] }}</td>
8848 <td class="width20"></td>
8849 </tr>
8850 <tr v-if="expandedRow === index" :key="'expanded-' + video.video_id">
8851 <td colspan="10">
8852 <iframe class="youtube-embed" :src="getYoutubeEmbedUrl(video[1], video[2], video[3])" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
8853 </td>
8854 </tr>
8855 </template>
8856 </tbody>
8857 </table>
8858 <div class="pagination">
8859 <button @click="prevPage" :disabled="currentPage === 1">Previous</button>
8860 <span>Page {{ currentPage }} of {{ totalPages }}</span>
8861 <button @click="nextPage" :disabled="currentPage === totalPages">Next</button>
8862 </div>
8863 </div>
8864 <div> </div>
8865
8866 <script>
8867 new Vue({
8868 el: '#app',
8869 data: {
8870 videos: [],
8871 filterKey: '',
8872 sortKey: 'submitted_at',
8873                    sortOrder: -1, // numeric: -1 = descending, 1 = ascending (drives the header arrows and the client-side sort)
8874 expandedRow: null,
8875 currentPage: 1,
8876 itemsPerPage: 1000,
8877 totalItems: 0
8878 },
8879 computed: {
8880 filteredVideos() {
8881 //return this.videos;
8882 let sortedVideos = [...this.videos].sort((a, b) => {
8883 let modifier = this.sortOrder;
8884                    let aValue = a[this.sortKey]; // rows are arrays, so only numeric sort keys (5, 6, 8) resolve client-side
8885                    let bValue = b[this.sortKey]; // string keys return undefined here and keep the server-provided order
8886
8887 // Convert to lowercase if sorting by string
8888 if (typeof aValue === 'string') {
8889 aValue = aValue.toLowerCase();
8890 bValue = bValue.toLowerCase();
8891 }
8892
8893 if (aValue < bValue) return -1 * modifier;
8894 if (aValue > bValue) return 1 * modifier;
8895 return 0;
8896 });
8897
8898 return sortedVideos.filter(video => {
8899 return video[0].toLowerCase().includes(this.filterKey.toLowerCase());
8900 });
8901 },
8902 totalPages() {
8903 return Math.ceil(this.totalItems / this.itemsPerPage);
8904 }
8905 },
8906 methods: {
8907 fetchData() {
8908 axios.get('/dashboard/get-video-metadata', {
8909 params: {
8910 sort_by: this.sortKey,
8911                            sort_order: this.sortOrder > 0 ? 'asc' : 'desc', // map the numeric direction to 'asc'/'desc' for the API
8912 page: this.currentPage,
8913 items_per_page: this.itemsPerPage
8914 }
8915 })
8916 .then(response => {
8917 this.videos = response.data.data;
8918 this.totalItems = response.data.total_items;
8919 })
8920 .catch(error => {
8921 console.error('There was an error fetching the video metadata:', error);
8922 });
8923 },
8924 sortBy(key) {
8925 if (this.sortKey === key) {
8926                        this.sortOrder *= -1; // flip the sort direction
8927 } else {
8928 this.sortKey = key;
8929                        this.sortOrder = -1; // default new sort keys to descending
8930 }
8931 },
8932 toggleRow(index) {
8933 if (this.expandedRow === index) {
8934 this.expandedRow = null;
8935 } else {
8936 this.expandedRow = index;
8937 }
8938 },
8939 getYoutubeEmbedUrl(youtubeId, startTime, endTime) {
8940 return `https://www.youtube.com/embed/${youtubeId}?start=${startTime}&end=${endTime}&autoplay=1`;
8941 },
8942 prevPage() {
8943 if (this.currentPage > 1) {
8944 this.currentPage--;
8945 this.fetchData();
8946 }
8947 },
8948 nextPage() {
8949 if (this.currentPage < this.totalPages) {
8950 this.currentPage++;
8951 this.fetchData();
8952 }
8953 }
8954 },
8955 mounted() {
8956 this.fetchData();
8957 }
8958 });
8959 </script>
8960</body>
8961</html>
8962
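For reference, `fetchData()` above reads `response.data.data` and `response.data.total_items` from `/dashboard/get-video-metadata`. The sketch below is a minimal, hypothetical reconstruction of that contract; FastAPI and the in-memory `VIDEOS` list are assumptions, and only the parameter names and response shape come from the page code:

```python
# Hypothetical sketch of the endpoint contract consumed by fetchData() above.
# FastAPI and the VIDEOS stand-in are assumptions; the real backend is not
# part of this listing.
from fastapi import FastAPI

app = FastAPI()

VIDEOS: list[tuple] = []  # stand-in for the real metadata store


@app.get("/dashboard/get-video-metadata")
async def get_video_metadata(
    sort_by: str = "submitted_at",
    sort_order: str = "desc",
    page: int = 1,
    items_per_page: int = 1000,
):
    # Sorting is elided in this sketch; the page re-sorts numeric columns client-side.
    start = (page - 1) * items_per_page
    rows = VIDEOS[start:start + items_per_page]
    return {
        "data": rows,                # row tuples, rendered as video[0]..video[8]
        "total_items": len(VIDEOS),  # drives the totalPages computed property
    }
```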
8963
8964---
8965File: /validator-api/static/leaderboard.html
8966---
8967
8968<!DOCTYPE html>
8969<html lang="en">
8970<head>
8971 <meta charset="UTF-8">
8972 <meta name="viewport" content="width=device-width, initial-scale=1.0">
8973 <link rel="icon" href="static/favicon.ico" />
8974 <title>OMEGA Leaderboard</title>
8975 <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/vue.min.js"></script>
8976 <script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>
8977 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
8978 <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/chartjs-adapter-date-fns.bundle.min.js"></script>
8979 <style>
8980 /* Apply base font styles */
8981 html, body {
8982 font-family: Roboto, sans-serif;
8983 line-height: 1.5;
8984                height: 101%; /* Slightly over full height so a scroll bar is always present */
8985 margin: 0; /* Reset any default margin */
8986 padding: 0; /* Reset any default padding */
8987 }
8988
8989 body {
8990 font-size: 16px;
8991 line-height: 1.6;
8992 font-weight: 400;
8993 background-color: #0a1128;
8994 background-image:
8995 linear-gradient(
8996 to bottom,
8997 rgba(255, 255, 255, 0) 0%, /* Fully transparent */
8998                        rgba(255, 255, 255, 0) calc(100% - 700px), /* Transparent until 700px from the bottom */
8999                        #0a1128 calc(100% - 200px), /* Fade to the background color over the next 500px */
9000                        #0a1128 100% /* Fully opaque for the last 200px */
9001 ),
9002 url(https://omegatron.ai/static/images/0423e77f5905b1f1bccb.png);
9003 background-size: cover;
9004 background-repeat: no-repeat;
9005 background-position: center;
9006 color: #ffffff; /* Light text color for better readability */
9007 }
9008 /*
9009 body::before {
9010 position: absolute;
9011 content: "";
9012 width: 100%;
9013 height: 100%;
9014 top: 0;
9015 left: 0;
9016 background-image: linear-gradient(to bottom, #0a1128 0%, rgba(10, 17, 40, 0.8078431373) 30%, rgba(10, 17, 40, 0.5607843137) 60%, rgba(10, 17, 40, 0.1450980392) 95%) !important;
9017 z-index: 1;
9018 }*/
9019 .logo {
9020 display: block; /* Use block to apply margin auto for centering */
9021 width: 75px; /* Set the width of the logo container */
9022 height: 75px; /* Set the height of the logo container */
9023 margin: 0 auto; /* Center the logo horizontally */
9024 margin-top: 2rem; /* Add space above the logo */
9025 }
9026
9027 .logo svg {
9028 width: 100%; /* Make the SVG fill the container */
9029 height: 100%; /* Make the SVG fill the container */
9030 }
9031
9032 h1 {
9033 text-align: center;
9034 font-size: 2.5rem;
9035 margin-bottom: 3rem;
9036 margin-top: 0;
9037 text-shadow: 3px 3px 4px rgba(0, 0, 0, 0.75);
9038 }
9039
9040 /* Table styles */
9041 table {
9042 width: 90%;
9043 margin: 0 auto; /* Center table horizontally */
9044 border-collapse: collapse;
9045 text-indent: 0;
9046 color: #ffffff; /* Ensure table text is light-colored */
9047 border-radius: 10px; /* Rounded corners */
9048 box-shadow: 4px 4px 8px 0 rgba(70, 70, 70, 0.3); /* Drop shadow */
9049 }
9050
9051 /* not first child */
9052 tr:not(:first-child) {
9053 cursor: pointer;
9054 transition: background-color 0.3s ease, box-shadow 0.3s ease;
9055 }
9056
9057 tr:not(:first-child):hover {
9058 background-color: rgba(98, 30, 100, 0.4); /* More translucent background color */
9059 box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
9060 }
9061
9062 tr.graph-row:hover {
9063 background-color: #1f2a48; /* Dark background for headers */
9064 }
9065
9066 th.center {
9067 text-align: center;
9068 }
9069
9070 .width520 {
9071 width: 520px;
9072 }
9073
9074 .width20 {
9075 width: 20px;
9076 }
9077
9078 /* Style for table headers and cells to inherit the rounded corners */
9079 th, td {
9080 /*border: 1px solid #ddd; Light gray border for cells */
9081 padding: 8px; /* Padding for cell content */
9082 text-align: left;
9083 width: 10%;
9084 }
9085
9086 th {
9087 background-color: #272727; /* Dark background for headers */
9088 color: #ffffff; /* Light text color for headers */
9089 font-weight: bold; /* Bold font weight for better readability */
9090 }
9091
9092 /* Style for the first and last cells in each row to inherit the rounded corners */
9093 th:first-child {
9094 border-top-left-radius: 10px; /* Top-left rounded corner */
9095 }
9096
9097 th:last-child {
9098 border-top-right-radius: 10px; /* Top-right rounded corner */
9099 }
9100
9101 /* Style for the last row to inherit the rounded corners */
9102 tr:last-child td:first-child {
9103 border-bottom-left-radius: 10px; /* Bottom-left rounded corner */
9104 }
9105
9106 tr:last-child td:last-child {
9107 border-bottom-right-radius: 10px; /* Bottom-right rounded corner */
9108 }
9109
9110 /* Body styles */
9111 tbody tr:nth-child(odd) {
9112 background-color: #162035; /* Dark background for odd rows */
9113 }
9114
9115 tbody tr:nth-child(even) {
9116 background-color: #1f2a48; /* Slightly different dark background for even rows */
9117 }
9118
9119 /* Footer styles */
9120 tfoot {
9121 font-weight: bold;
9122 background-color: #1f2a48; /* Consistent background for footer */
9123 }
9124
9125 .refresh-icon {
9126 cursor: pointer;
9127 }
9128
9129 .sortable {
9130 cursor: pointer;
9131 }
9132
9133 .arrow {
9134 display: inline-block;
9135 margin-left: 5px;
9136 }
9137
9138 .arrow-up::before {
9139 content: '▲';
9140 }
9141
9142 .arrow-down::before {
9143 content: '▼';
9144 }
9145
9146 input[type="text"] {
9147 width: 30%; /* Match the table width or adjust as needed */
9148 padding: 10px; /* Larger padding for a taller input field */
9149 margin-bottom: 20px; /* Space between the input field and the table */
9150 font-size: 16px; /* Larger font size for better readability */
9151 border: 1px solid #ccc; /* Subtle border color */
9152 border-radius: 5px; /* Slightly rounded corners */
9153 box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1); /* Inner shadow for depth */
9154 display: block; /* Ensure it's a block-level element */
9155                margin-left: 11%; /* Fixed offset from the left edge (this input is not centered) */
9156 margin-right: auto;
9157 }
9158
9159 .input-social-container {
9160 display: flex;
9161 align-items: center;
9162 }
9163
9164 .info-text {
9165 margin-left: 9em;
9166 margin-right: 0em;
9167 }
9168
9169 .social-icons {
9170 position: absolute;
9171 right: 5%;
9172 display: flex;
9173 align-items: center;
9174 }
9175
9176 .social-icons button {
9177 background: none;
9178 border: none;
9179 cursor: pointer;
9180 }
9181
9182 .social-icon {
9183 display: flex;
9184 justify-content: center;
9185 align-items: center;
9186 width: 50px; /* Adjust size as needed */
9187 height: 50px; /* Adjust size as needed */
9188 border-radius: 50%; /* Make it circular */
9189 border: 1px solid #ccc; /* Light gray border */
9190 margin-left: 15px; /* Space between icons */
9191 overflow: hidden; /* Ensure the content fits the circular shape */
9192 margin-bottom: 2em;
9193 }
9194
9195 .social-icon img,
9196 .social-icon svg {
9197 width: 100%;
9198 height: 100%;
9199 display: block;
9200 object-fit: cover; /* Ensure the image covers the area */
9201 }
9202
9203 /* Responsive styles for smaller screens */
9204 @media (max-width: 768px) {
9205 body {
9206 font-size: 0.9em; /* Smaller font size on mobile */
9207 }
9208
9209 h1 {
9210 font-size: 1.5rem; /* Adjust heading size for mobile */
9211 }
9212
9213 .logo {
9214 width: 30%; /* Increase width percentage for smaller screens */
9215 }
9216
9217 input[type="text"] {
9218 width: 80%; /* Increase width for mobile */
9219 padding: 8px; /* Adjust padding */
9220 font-size: 1em; /* Adjust font size */
9221 }
9222
9223 table {
9224 width: 100%; /* Full width on mobile */
9225 }
9226 }
9227
9228 .chart-card {
9229 background-color: rgba(255, 255, 255, 0.05);
9230 border: 1px solid rgba(255, 255, 255, 0.1);
9231 border-radius: 10px;
9232 box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1), 0 1px 3px rgba(0, 0, 0, 0.08);
9233 padding: 20px;
9234 margin: 20px auto;
9235 max-width: 88%;
9236 transition: transform 0.3s ease-in-out, box-shadow 0.3s ease-in-out;
9237 margin-bottom: 40px;
9238 }
9239
9240 .chart-card:hover {
9241 transform: translateY(-5px);
9242 box-shadow: 0 7px 14px rgba(0, 0, 0, 0.15), 0 3px 6px rgba(0, 0, 0, 0.1);
9243 }
9244
9245 .chart-title {
9246 color: #ffffff;
9247 font-size: 1.5rem;
9248 margin-bottom: 15px;
9249 text-align: center;
9250 }
9251
9252 .chart-container {
9253 position: relative;
9254 height: 40vh;
9255 width: 100%;
9256 }
9257
9258 /* Responsive adjustments */
9259 @media (max-width: 768px) {
9260 .chart-card {
9261 padding: 15px;
9262 margin: 15px auto;
9263 }
9264
9265 .chart-title {
9266 font-size: 1.2rem;
9267 }
9268
9269 .chart-container {
9270 height: 50vh;
9271 }
9272 }
9273
9274 .miner-chart-container {
9275 height: 300px;
9276 margin-top: 20px;
9277 margin-bottom: 20px;
9278 }
9279 .expanded-row {
9280 background-color: rgba(255, 255, 255, 0.05);
9281 transition: all 0.3s ease;
9282 }
9283 .expanded-content {
9284 padding: 20px;
9285 }
9286
9287 .focus-metrics {
9288 display: flex;
9289 justify-content: space-around;
9290 flex-wrap: wrap;
9291 margin-bottom: 20px;
9292 }
9293
9294 .focus-metric {
9295 background-color: rgba(255, 255, 255, 0.1);
9296 border-radius: 8px;
9297 padding: 15px;
9298 text-align: center;
9299 flex: 1;
9300 margin: 10px;
9301 min-width: 200px;
9302 backdrop-filter: blur(10px);
9303 -webkit-backdrop-filter: blur(10px);
9304 }
9305
9306 .focus-metric-title {
9307 font-size: 1.2rem;
9308 margin-bottom: 10px;
9309 color: #c0c0c0;
9310 font-weight: bold;
9311 }
9312
9313 .focus-metric-value {
9314 font-size: 1.5rem;
9315 font-weight: bold;
9316 }
9317
9318 .focus-metric-value-usd {
9319 font-size: 1.0rem;
9320 color: #aaaaaa;
9321 }
9322 </style>
9323</head>
9324<body>
9325 <div id="app">
9326 <div class="logo">
9327 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 75 75">
9328 <!-- Define the drop shadow filter -->
9329 <defs>
9330 <filter id="text-shadow" x="-20%" y="-20%" width="140%" height="140%">
9331 <feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur"/>
9332 <feOffset in="blur" dx="2" dy="2" result="offsetBlur"/>
9333 <feMerge>
9334 <feMergeNode in="offsetBlur"/>
9335 <feMergeNode in="SourceGraphic"/>
9336 </feMerge>
9337 </filter>
9338 </defs>
9339 <text x="50%" y="70%" dominant-baseline="middle" text-anchor="middle" font-family="Roboto" font-size="100" fill="#068AC7" filter="url(#text-shadow)">Ω</text>
9340 </svg>
9341 </div>
9342 <h1>OMEGA Leaderboard</h1>
9343 <div v-if="focusVideoData" class="chart-card">
9344 <h2 class="chart-title">Ω Focus KPIs</h2>
9345 <div class="focus-metrics">
9346 <div class="focus-metric">
9347 <div class="focus-metric-title">TOTAL WALLETS</div>
9348 <div class="focus-metric-value">{{ totalWallets }}</div>
9349 </div>
9350 <div class="focus-metric">
9351 <div class="focus-metric-title">TOTAL TASKS DONE</div>
9352 <div class="focus-metric-value">{{ totalVideosPurchased }}</div>
9353 </div>
9354 <div class="focus-metric">
9355 <div class="focus-metric-title">TOTAL TAO BALANCE</div>
9356 <div class="focus-metric-value">{{ totalTaoBalance.toFixed(3) }} <span class="focus-metric-value-usd">(${{ totalTaoBalanceUSD }} USD)</span></div>
9357 </div>
9358 <div class="focus-metric">
9359 <div class="focus-metric-title">TOTAL TAO REVENUE</div>
9360 <div class="focus-metric-value">{{ totalTaoRevenue.toFixed(3) }} <span class="focus-metric-value-usd">(${{ totalTaoRevenueUSD }} USD)</span></div>
9361 </div>
9362 </div>
9363 <div class="chart-container">
9364 <canvas ref="focusVideoChart"></canvas>
9365 </div>
9366 </div>
9367
9368 <div v-if="datasetSizeData" class="chart-card">
9369 <h2 class="chart-title">OMEGA Multimodal Dataset Size Over Time</h2>
9370 <div class="focus-metrics">
9371 <div class="focus-metric">
9372 <div class="focus-metric-title">TOTAL ROWS</div>
9373 <div class="focus-metric-value">{{ totalRows.toLocaleString() }}</div>
9374 </div>
9375 <div class="focus-metric">
9376 <div class="focus-metric-title">MEMORY SIZE (GB)</div>
9377 <div class="focus-metric-value">{{ memory_size_gb.toFixed(2) }}</div>
9378 </div>
9379 </div>
9380 <div class="chart-container">
9381 <canvas ref="datasetSizeChart"></canvas>
9382 </div>
9383 </div>
9384
9385 <div class="input-social-container">
9386 <p class="info-text">Click on a row to display miner performance graph.</p>
9387 <input type="text" v-model="filterKey" placeholder="Filter by hotkey...">
9388 <div class="social-icons">
9389 <a href="https://twitter.com/omegalabsai" target="_blank" class="social-icon"><button class="" type="button"><span class=""><img src="https://omegatron.ai/static/images/16b3234e15bf0aece98c.png"></span></button></a>
9390 <a href="https://github.com/omegalabsinc" target="_blank" class="social-icon"><button class="" type="button"><span class=""><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" fill="none"><path fill="#fff" d="M12 2.247a10 10 0 0 0-3.162 19.487c.5.088.687-.212.687-.475 0-.237-.012-1.025-.012-1.862-2.513.462-3.163-.613-3.363-1.175a3.64 3.64 0 0 0-1.025-1.413c-.35-.187-.85-.65-.012-.662a2 2 0 0 1 1.537 1.025 2.137 2.137 0 0 0 2.913.825c.043-.509.27-.984.637-1.338-2.225-.25-4.55-1.112-4.55-4.937a3.9 3.9 0 0 1 1.025-2.688 3.6 3.6 0 0 1 .1-2.65s.837-.262 2.75 1.025a9.43 9.43 0 0 1 5 0c1.912-1.3 2.75-1.025 2.75-1.025.37.838.406 1.786.1 2.65a3.87 3.87 0 0 1 1.025 2.688c0 3.837-2.337 4.687-4.562 4.937a2.37 2.37 0 0 1 .675 1.85c0 1.338-.013 2.413-.013 2.75 0 .263.188.575.688.475A10.005 10.005 0 0 0 12 2.247"></path></svg></span></button></a>
9391 </div>
9392 </div>
9393 <table>
9394 <tr>
9395 <th class="center width520">Hotkey</th>
9396 <th>Project</th>
9397 <th class="sortable" @click="sortBy('datapoints')">Datapoints<span v-if="sortKey === 'datapoints'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9398 <th class="sortable" @click="sortBy('avg_desc_relevance')">Avg Desc Relevance<span v-if="sortKey === 'avg_desc_relevance'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9399 <th class="sortable" @click="sortBy('avg_query_relevance')">Avg Query Relevance<span v-if="sortKey === 'avg_query_relevance'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9400 <th class="sortable" @click="sortBy('avg_score')">Avg Score<span v-if="sortKey === 'avg_score'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9401 <th>Last Updated</th>
9402 <th class="width20"><span class="refresh-icon" @click="fetchData">↻</span></th>
9403 </tr>
9404 <template v-for="miner in filteredMiners">
9405 <tr :key="miner.hotkey" @click="toggleExpand(miner)" :class="{ 'expanded-row': miner.expanded }">
9406 <td class="width520">{{ miner.hotkey }}</td>
9407 <td>{{ miner.is_bittensor ? 'Bittensor' : 'Commune' }}</td>
9408 <td>{{ miner.datapoints }}</td>
9409 <td>{{ miner.avg_desc_relevance }}</td>
9410 <td>{{ miner.avg_query_relevance }}</td>
9411 <td>{{ miner.avg_score }}</td>
9412 <td>{{ miner.last_updated }}</td>
9413 <td class="width20"></td>
9414 </tr>
9415 <tr v-if="miner.expanded" :key="miner.hotkey + '-expanded'" class="graph-row">
9416 <td colspan="8">
9417 <div class="expanded-content">
9418 <div class="miner-chart-container">
9419 <canvas :ref="'minerChart-' + miner.hotkey"></canvas>
9420 </div>
9421 </div>
9422 </td>
9423 </tr>
9424 </template>
9425 </table>
9426 </div>
9427 <div> </div>
9428
9429 <script>
9430 new Vue({
9431 el: '#app',
9432 data: {
9433 miners: [],
9434 filterKey: '',
9435 sortKey: '',
9436 sortOrder: 1,
9437 datasetSizeData: null,
9438 chartCreated: false,
9439 minerCharts: {},
9440 focusVideoData: null,
9441 totalWallets: 0,
9442 totalVideosPurchased: 0,
9443 totalTaoBalance: 0.0,
9444 totalTaoRevenue: 0.0,
9445 focusVideoChartCreated: false,
9446                    taoPrice: 0, totalRows: 0, memory_size_gb: 0.0, // totalRows and memory_size_gb declared up front so Vue 2 can track them
9447 },
9448 computed: {
9449 filteredMiners() {
9450 return this.miners.filter(miner => {
9451 return miner.hotkey.toLowerCase().includes(this.filterKey.toLowerCase());
9452 }).sort((a, b) => {
9453 let modifier = this.sortOrder;
9454 if(a[this.sortKey] < b[this.sortKey]) return -1 * modifier;
9455 if(a[this.sortKey] > b[this.sortKey]) return 1 * modifier;
9456 return 0;
9457 });
9458 },
9459 totalTaoBalanceUSD() {
9460 return (this.totalTaoBalance * this.taoPrice).toFixed(2);
9461 },
9462 totalTaoRevenueUSD() {
9463 return (this.totalTaoRevenue * this.taoPrice).toFixed(2);
9464 }
9465 },
9466 watch: {
9467 datasetSizeData: {
9468 handler(newData) {
9469 console.log('Dataset size data updated:', newData);
9470 this.$nextTick(() => {
9471 this.createOrUpdateChart();
9472 });
9473 },
9474 deep: true
9475 }
9476 },
9477 methods: {
9478 fetchData() {
9479 axios.get('/api/leaderboard')
9480 .then(response => {
9481 this.miners = response.data;
9482 })
9483 .catch(error => {
9484 console.error('There was an error fetching the leaderboard data:', error);
9485 });
9486 },
9487 sortBy(key) {
9488 if (this.sortKey === key) {
9489 this.sortOrder *= -1;
9490 } else {
9491 this.sortKey = key;
9492 this.sortOrder = 1;
9493 }
9494 },
9495 fetchDatasetSizeData() {
9496 console.log('Fetching dataset size data...');
9497 axios.get('/api/leaderboard-dataset-data')
9498 .then(response => {
9499 console.log('Dataset size data received:', response.data);
9500 const data = response.data;
9501
9502 // Process the data to extract only what we need
9503 this.datasetSizeData = {
9504 labels: data.map(item => item.snapshot_date),
9505 datasets: [
9506 {
9507 label: 'TOTAL ROWS',
9508 borderColor: '#98C379', // Green
9509 backgroundColor: '#98C379',
9510 data: data.map(item => ({
9511 x: item.snapshot_date,
9512 y: item.total_rows
9513 })),
9514 yAxisID: 'y'
9515 },
9516 {
9517 label: 'MEMORY SIZE (GB)',
9518 borderColor: '#61AFEF', // Blue
9519 backgroundColor: '#61AFEF',
9520 data: data.map(item => ({
9521 x: item.snapshot_date,
9522 y: item.memory_size_gb
9523 })),
9524 yAxisID: 'y1'
9525 }
9526 ]
9527 };
9528
9529 this.totalRows = data[data.length - 1].total_rows;
9530 this.memory_size_gb = data[data.length - 1].memory_size_gb;
9531
9532 this.$nextTick(() => {
9533 this.createOrUpdateChart();
9534 });
9535 })
9536 .catch(error => {
9537 console.error('There was an error fetching the dataset size data:', error);
9538 });
9539 },
9540 createOrUpdateChart() {
9541 const canvas = this.$refs.datasetSizeChart;
9542 if (canvas && this.datasetSizeData && !this.chartCreated) {
9543 const ctx = canvas.getContext('2d');
9544
9545 this.chart = new Chart(ctx, {
9546 type: 'line',
9547 data: {
9548 labels: this.datasetSizeData.labels,
9549 datasets: this.datasetSizeData.datasets,
9550 },
9551 options: {
9552 responsive: true,
9553 maintainAspectRatio: false,
9554 scales: {
9555 x: {
9556 type: 'time',
9557 time: {
9558 unit: 'month',
9559 displayFormats: {
9560                                        month: 'MMM yyyy'
9561 }
9562 },
9563 title: {
9564 display: true,
9565 text: 'DATE (UTC)',
9566 color: '#E0E0E0',
9567 font: {
9568 size: 14,
9569 weight: 'bold'
9570 }
9571 },
9572 ticks: {
9573 color: '#E0E0E0',
9574 font: {
9575 size: 12
9576 },
9577 source: 'data',
9578 maxRotation: 0,
9579 autoSkip: true,
9580 maxTicksLimit: 12
9581 }
9582 },
9583 y: {
9584 type: 'linear',
9585 display: true,
9586 position: 'left',
9587 title: {
9588 display: true,
9589 text: 'TOTAL ROWS',
9590 color: '#E0E0E0',
9591 font: {
9592 size: 14,
9593 weight: 'bold'
9594 }
9595 },
9596 ticks: {
9597 color: '#E0E0E0',
9598 font: {
9599 size: 12
9600 },
9601 callback: function(value, index, values) {
9602 return value.toLocaleString();
9603 }
9604 }
9605 },
9606 y1: {
9607 type: 'linear',
9608 display: true,
9609 position: 'right',
9610 title: {
9611 display: true,
9612 text: 'MEMORY SIZE (GB)',
9613 color: '#E0E0E0',
9614 font: {
9615 size: 14,
9616 weight: 'bold'
9617 }
9618 },
9619 ticks: {
9620 color: '#E0E0E0',
9621 font: {
9622 size: 12
9623 },
9624 callback: function(value, index, values) {
9625 return value.toLocaleString();
9626 }
9627 },
9628 grid: {
9629 drawOnChartArea: false
9630 }
9631 },
9632 },
9633 plugins: {
9634 legend: {
9635 labels: {
9636 color: '#E0E0E0',
9637 font: {
9638 size: 12
9639 }
9640 }
9641 },
9642 tooltip: {
9643 callbacks: {
9644 title: function(tooltipItems) {
9645 return tooltipItems[0].label + ' (UTC)';
9646 },
9647 label: function(context) {
9648 let label = context.dataset.label || '';
9649 if (label) {
9650 label += ': ';
9651 }
9652 if (context.parsed.y !== null) {
9653 label += context.parsed.y.toLocaleString();
9654 }
9655 return label;
9656 }
9657 }
9658 }
9659 }
9660 }
9661 });
9662 this.chartCreated = true;
9663 } else if (this.chart && this.datasetSizeData) {
9664 this.chart.data = this.datasetSizeData;
9665 this.chart.update();
9666 } else {
9667 console.log('Unable to create or update chart. Canvas:', !!canvas, 'Data:', !!this.datasetSizeData, 'Chart created:', this.chartCreated);
9668 }
9669 },
9670 toggleExpand(miner) {
9671 this.$set(miner, 'expanded', !miner.expanded);
9672 if (miner.expanded) {
9673 this.fetchMinerData(miner.hotkey);
9674 }
9675 },
9676 fetchMinerData(hotkey) {
9677 axios.get(`/api/leaderboard-miner-data?hotkey=${hotkey}`)
9678 .then(response => {
9679 this.createMinerChart(hotkey, response.data);
9680 })
9681 .catch(error => {
9682 console.error('Error fetching miner data:', error);
9683 });
9684 },
9685 createMinerChart(hotkey, data) {
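                        // Vue 2 exposes refs registered inside v-for as arrays, hence the [0] below.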
9686 const canvas = this.$refs[`minerChart-${hotkey}`][0];
9687 const ctx = canvas.getContext('2d');
9688
9689 if (this.minerCharts[hotkey]) {
9690 this.minerCharts[hotkey].destroy();
9691 }
9692
9693 this.minerCharts[hotkey] = new Chart(ctx, {
9694 type: 'line',
9695 data: {
9696 labels: data.map(item => item.snapshot_date),
9697 datasets: [
9698 {
9699 label: 'Datapoints',
9700 borderColor: '#98C379', // Green
9701 backgroundColor: '#98C379',
9702 data: data.map(item => ({x: item.snapshot_date, y: item.datapoints})),
9703 yAxisID: 'y'
9704 },
9705 {
9706 label: 'Avg Score',
9707 borderColor: '#61AFEF', // Blue
9708 backgroundColor: '#61AFEF',
9709 data: data.map(item => ({x: item.snapshot_date, y: item.avg_score})),
9710 yAxisID: 'y1'
9711 },
9712 {
9713 label: 'Avg Query Relevance',
9714 borderColor: '#D19A66',
9715 backgroundColor: '#D19A66',
9716 data: data.map(item => ({x: item.snapshot_date, y: item.avg_query_relevance})),
9717 yAxisID: 'y1'
9718 },
9719 {
9720 label: 'Avg Desc Relevance',
9721 borderColor: '#C678DD',
9722 backgroundColor: '#C678DD',
9723 data: data.map(item => ({x: item.snapshot_date, y: item.avg_desc_relevance})),
9724 yAxisID: 'y1'
9725 },
9726 {
9727 label: 'Incentive',
9728 borderColor: '#E06C75',
9729 backgroundColor: '#E06C75',
9730 data: data.map(item => ({x: item.snapshot_date, y: item.incentive})),
9731 yAxisID: 'y2'
9732 }
9733 ]
9734 },
9735 options: {
9736 responsive: true,
9737 maintainAspectRatio: false,
9738 scales: {
9739 x: {
9740 type: 'time',
9741 time: {
9742 parser: 'yyyy-MM-dd',
9743 unit: 'day',
9744 displayFormats: {
9745 day: 'MMM d, yyyy'
9746 }
9747 },
9748 title: {
9749 display: true,
9750 text: 'DATE (UTC)',
9751 color: '#E0E0E0',
9752 font: {
9753 size: 14,
9754 weight: 'bold'
9755 }
9756 },
9757 ticks: {
9758 color: '#E0E0E0',
9759 font: {
9760 size: 12
9761 }
9762 }
9763 },
9764 y: {
9765 type: 'linear',
9766 display: true,
9767 position: 'left',
9768 title: {
9769 display: true,
9770 text: 'DATAPOINTS',
9771 color: '#E0E0E0',
9772 font: {
9773 size: 14,
9774 weight: 'bold'
9775 }
9776 },
9777 ticks: {
9778 color: '#E0E0E0',
9779 font: {
9780 size: 12
9781 },
9782 callback: function(value) {
9783 return value.toLocaleString();
9784 }
9785 }
9786 },
9787 y1: {
9788 type: 'linear',
9789 display: true,
9790 position: 'right',
9791 min: 0,
9792 max: 1,
9793 title: {
9794 display: true,
9795 text: 'SCORES',
9796 color: '#E0E0E0',
9797 font: {
9798 size: 14,
9799 weight: 'bold'
9800 }
9801 },
9802 ticks: {
9803 color: '#E0E0E0',
9804 font: {
9805 size: 12
9806 }
9807 },
9808 grid: {
9809 drawOnChartArea: false
9810 }
9811 },
9812 y2: {
9813 type: 'linear',
9814 display: false,
9815 position: 'right',
9816 grid: {
9817 drawOnChartArea: false,
9818 },
9819 ticks: {
9820 callback: function(value, index, values) {
9821 return value.toFixed(5);
9822 }
9823 }
9824 },
9825 },
9826 plugins: {
9827 legend: {
9828 labels: {
9829 color: '#E0E0E0',
9830 font: {
9831 size: 12
9832 }
9833 }
9834 },
9835 tooltip: {
9836 callbacks: {
9837 title: function(tooltipItems) {
9838 return tooltipItems[0].label + ' (UTC)';
9839 },
9840 label: function(context) {
9841 let label = context.dataset.label || '';
9842 if (label) {
9843 label += ': ';
9844 }
9845 if (context.parsed.y !== null) {
9846 if (label.startsWith('Incentive')) {
9847 // Display incentive with 5 decimal places
9848 label += context.parsed.y.toFixed(5);
9849 } else {
9850 // For other metrics, use the existing formatting
9851 label += context.parsed.y.toLocaleString();
9852 }
9853 }
9854 return label;
9855 }
9856 }
9857 }
9858 }
9859 }
9860 });
9861
9862 },
9863 fetchFocusVideoData() {
9864 console.log('Fetching focus video data...');
9865 axios.get('/api/leaderboard-focus-data')
9866 .then(response => {
9867 console.log('Focus video data received:', response.data);
9868 const data = response.data;
9869
9870 // Process the data
9871 this.focusVideoData = {
9872 labels: data.map(item => item.snapshot_date),
9873 datasets: [
9874 {
9875 label: 'Total Wallets',
9876 borderColor: '#98C379', // Green
9877 backgroundColor: '#98C379',
9878 data: data.map(item => ({
9879 x: item.snapshot_date,
9880 y: item.total_wallets
9881 })),
9882 yAxisID: 'y2'
9883 },
9884 {
9885 label: 'Total Tasks Done',
9886 borderColor: '#61AFEF', // Blue
9887 backgroundColor: '#61AFEF',
9888 data: data.map(item => ({
9889 x: item.snapshot_date,
9890 y: item.total_videos_purchased
9891 })),
9892 yAxisID: 'y'
9893 },
9894 {
9895 label: 'Total TAO Balance',
9896 borderColor: '#C678DD', // Purple
9897 backgroundColor: '#C678DD',
9898 data: data.map(item => ({
9899 x: item.snapshot_date,
9900 y: item.total_tao_balance
9901 })),
9902 yAxisID: 'y1'
9903 },
9904 {
9905 label: 'Total TAO Revenue',
9906 borderColor: '#E06C75', // Red
9907 backgroundColor: '#E06C75',
9908 data: data.map(item => ({
9909 x: item.snapshot_date,
9910 y: item.total_tao_revenue
9911 })),
9912 yAxisID: 'y1'
9913 }
9914 ]
9915 };
9916
9917 this.totalWallets = data[data.length - 1].total_wallets;
9918 this.totalVideosPurchased = data[data.length - 1].total_videos_purchased;
9919 this.totalTaoBalance = data[data.length - 1].total_tao_balance;
9920 this.totalTaoRevenue = data[data.length - 1].total_tao_revenue;
9921
9922 this.$nextTick(() => {
9923 this.createOrUpdateFocusVideoChart();
9924 });
9925 })
9926 .catch(error => {
9927 console.error('There was an error fetching the focus video data:', error);
9928 });
9929 },
9930 createOrUpdateFocusVideoChart() {
9931 const canvas = this.$refs.focusVideoChart;
9932 if (canvas && this.focusVideoData && !this.focusVideoChartCreated) {
9933 const ctx = canvas.getContext('2d');
9934
9935 this.focusVideoChart = new Chart(ctx, {
9936 type: 'line',
9937 data: this.focusVideoData,
9938 options: {
9939 responsive: true,
9940 maintainAspectRatio: false,
9941 scales: {
9942 x: {
9943 type: 'time',
9944 time: {
9945 unit: 'month',
9946 displayFormats: {
9947 month: 'MMM yyyy'
9948 }
9949 },
9950 title: {
9951 display: true,
9952 text: 'DATE',
9953 color: '#E0E0E0',
9954 font: {
9955 size: 14,
9956 weight: 'bold'
9957 }
9958 },
9959 ticks: {
9960 color: '#E0E0E0',
9961 font: {
9962 size: 12
9963 },
9964 maxRotation: 0,
9965 autoSkip: true,
9966 maxTicksLimit: 12
9967 }
9968 },
9969 y2: {
9970 type: 'linear',
9971                            display: false, // axis hidden; wallet counts still plot against its scale
9972 position: 'left',
9973 title: {
9974 display: false,
9975 text: 'TOTAL WALLETS'
9976 },
9977 ticks: {
9978 color: '#98C379',
9979 callback: function(value) {
9980 return value >= 1000 ? value / 1000 + 'K' : value;
9981 }
9982 },
9983 grid: {
9984 drawOnChartArea: false
9985 }
9986 },
9987 y: {
9988 type: 'linear',
9989                            display: false, // axis hidden; task counts still plot against its scale
9990 position: 'left',
9991 title: {
9992 display: true,
9993 text: 'COUNT',
9994 color: '#E0E0E0',
9995 font: {
9996 size: 14,
9997 weight: 'bold'
9998 }
9999 },
10000 ticks: {
10001 color: '#E0E0E0',
10002 font: {
10003 size: 12
10004 },
10005 callback: function(value) {
10006 return value.toLocaleString();
10007 }
10008 }
10009 },
10010 y1: {
10011 type: 'linear',
10012 display: true,
10013 position: 'right',
10014 title: {
10015 display: true,
10016 text: 'TAO',
10017 color: '#E0E0E0',
10018 font: {
10019 size: 14,
10020 weight: 'bold'
10021 }
10022 },
10023 ticks: {
10024 color: '#E0E0E0',
10025 font: {
10026 size: 12
10027 },
10028 callback: function(value) {
10029 return value.toFixed(2);
10030 }
10031 },
10032 grid: {
10033 drawOnChartArea: false
10034 }
10035 },
10036 },
10037 plugins: {
10038 legend: {
10039 labels: {
10040 color: '#E0E0E0',
10041 font: {
10042 size: 12
10043 }
10044 }
10045 },
10046 tooltip: {
10047 callbacks: {
10048 title: function(tooltipItems) {
10049                                        return tooltipItems[0].label.replace(', 12:00:00 a.m.', ''); // strip the midnight time suffix from date-only labels
10050 },
10051 label: function(context) {
10052 let label = context.dataset.label || '';
10053 if (label) {
10054 label += ': ';
10055 }
10056 if (context.parsed.y !== null) {
10057 if (label.includes('TAO')) {
10058 label += context.parsed.y.toFixed(3);
10059 } else {
10060 label += context.parsed.y.toLocaleString();
10061 }
10062 }
10063 return label;
10064 }
10065 }
10066 }
10067 }
10068 }
10069 });
10070 this.focusVideoChartCreated = true;
10071 } else if (this.focusVideoChart && this.focusVideoData) {
10072 this.focusVideoChart.data = this.focusVideoData;
10073 this.focusVideoChart.update();
10074 } else {
10075 console.log('Unable to create or update focus video chart. Canvas:', !!canvas, 'Data:', !!this.focusVideoData, 'Chart created:', this.focusVideoChartCreated);
10076 }
10077 },
10078 fetchTaoPrice() {
10079 axios.get('https://focus-api.omegatron.ai/get_tao_price')
10080 .then(response => {
10081 this.taoPrice = response.data;
10082 })
10083 .catch(error => {
10084 console.error('Error fetching TAO price:', error);
10085 });
10086 },
10087 },
10088 mounted() {
10089 this.fetchData();
10090 this.fetchDatasetSizeData();
10091 this.fetchFocusVideoData();
10092 this.fetchTaoPrice();
10093 }
10094 });
10095 </script>
10096</body>
10097</html>
10098
10099
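The dataset chart above maps each snapshot by field name (`snapshot_date`, `total_rows`, `memory_size_gb`) and reads the last entry for the KPI tiles. A minimal sketch of the payload shape `/api/leaderboard-dataset-data` is expected to return (the field names come from the `.map()` calls above; the values are illustrative only):

```python
# Illustrative /api/leaderboard-dataset-data payload; field names are taken
# from the chart code above, the values are made up.
snapshots = [
    {"snapshot_date": "2024-10-01", "total_rows": 30_000_000, "memory_size_gb": 250.0},
    {"snapshot_date": "2024-11-01", "total_rows": 32_500_000, "memory_size_gb": 271.4},
]

labels = [s["snapshot_date"] for s in snapshots]  # becomes the x-axis labels
latest = snapshots[-1]                            # powers the KPI tiles
print(latest["total_rows"], latest["memory_size_gb"])
```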
10100---
10101File: /validator-api/validator_api/communex/_common.py
10102---
10103
10104import random
10105
10106 class ComxSettings:
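    """Static lists of public Commune RPC node URLs; get_node_url() samples from these."""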
10107 # TODO: improve node lists
10108 NODE_URLS: list[str] = [
10109 "wss://commune-api-node-0.communeai.net",
10110 "wss://commune-api-node-1.communeai.net",
10111 "wss://commune-api-node-2.communeai.net",
10112 "wss://commune-api-node-3.communeai.net",
10113 "wss://commune-api-node-4.communeai.net",
10114 "wss://commune-api-node-5.communeai.net",
10115 "wss://commune-api-node-6.communeai.net",
10116 "wss://commune-api-node-7.communeai.net",
10117 "wss://commune-api-node-8.communeai.net",
10118 "wss://commune-api-node-9.communeai.net",
10119 "wss://commune-api-node-10.communeai.net",
10120 "wss://commune-api-node-11.communeai.net",
10121 "wss://commune-api-node-12.communeai.net",
10122 "wss://commune-api-node-13.communeai.net",
10123 "wss://commune-api-node-14.communeai.net",
10124 "wss://commune-api-node-15.communeai.net",
10125 "wss://commune-api-node-16.communeai.net",
10126 "wss://commune-api-node-17.communeai.net",
10127 "wss://commune-api-node-18.communeai.net",
10128 "wss://commune-api-node-19.communeai.net",
10129 "wss://commune-api-node-20.communeai.net",
10130 "wss://commune-api-node-21.communeai.net",
10131 "wss://commune-api-node-22.communeai.net",
10132 "wss://commune-api-node-23.communeai.net",
10133 "wss://commune-api-node-24.communeai.net",
10134 "wss://commune-api-node-25.communeai.net",
10135 "wss://commune-api-node-26.communeai.net",
10136 "wss://commune-api-node-27.communeai.net",
10137 "wss://commune-api-node-28.communeai.net",
10138 "wss://commune-api-node-29.communeai.net",
10139 "wss://commune-api-node-30.communeai.net",
10140 "wss://commune-api-node-31.communeai.net",
10141 ]
10142 TESTNET_NODE_URLS: list[str] = [
10143 "wss://testnet-commune-api-node-0.communeai.net"]
10144
10145
10146def get_node_url(
10147 comx_settings: ComxSettings | None = None, *, use_testnet: bool = False
10148) -> str:
10149 comx_settings = comx_settings or ComxSettings()
10150 match use_testnet:
10151 case True:
10152 node_url = random.choice(comx_settings.TESTNET_NODE_URLS)
10153 case False:
10154 node_url = random.choice(comx_settings.NODE_URLS)
10155 return node_url
10156
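A minimal usage sketch for `get_node_url`: sample a random mainnet node and retry on another if the connection fails. The `connect()` helper is a placeholder for whatever actually opens the websocket (e.g. `CommuneClient(url)`); it is not part of this module:

```python
# Usage sketch only; connect() is a placeholder, not part of the audited code.
from validator_api.communex._common import get_node_url


def connect(url: str):
    """Placeholder for the real client constructor (e.g. CommuneClient(url))."""
    raise ConnectionError(f"could not reach {url}")  # simulate a dead node


def connect_with_retry(retries: int = 3):
    last_error: Exception = ConnectionError("no attempts made")
    for _ in range(retries):
        url = get_node_url(use_testnet=False)  # random choice over NODE_URLS
        try:
            return connect(url)
        except ConnectionError as err:
            last_error = err
    raise last_error  # every sampled node failed
```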
10157
10158---
10159File: /validator-api/validator_api/communex/client.py
10160---
10161
10162import json
10163import queue
10164from concurrent.futures import Future, ThreadPoolExecutor
10165from contextlib import contextmanager
10166from copy import deepcopy
10167from dataclasses import dataclass
10168from typing import Any, Mapping, TypeVar
10169
10170from substrateinterface import ExtrinsicReceipt # type: ignore
10171from substrateinterface import Keypair # type: ignore
10172from substrateinterface import SubstrateInterface # type: ignore
10173from substrateinterface.storage import StorageKey # type: ignore
10174
10175from validator_api.communex.errors import ChainTransactionError, NetworkQueryError
10176from validator_api.communex.types import NetworkParams, Ss58Address, SubnetParams
10177
10178# TODO: InsufficientBalanceError, MismatchedLengthError etc
10179
10180MAX_REQUEST_SIZE = 9_000_000
10181
10182
10183@dataclass
10184class Chunk:
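    """One slice of a batched RPC call: the (method, params) requests, the
    storage-key prefixes used to decode the responses, and the per-function
    decode parameters produced by _get_lists()."""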
10185 batch_requests: list[tuple[Any, Any]]
10186 prefix_list: list[list[str]]
10187 fun_params: list[tuple[Any, Any, Any, Any, str]]
10188
10189
10190T1 = TypeVar("T1")
10191T2 = TypeVar("T2")
10192
10193
10194class CommuneClient:
10195 """
10196 A client for interacting with Commune network nodes, querying storage,
10197 submitting transactions, etc.
10198
10199 Attributes:
10200 wait_for_finalization: Whether to wait for transaction finalization.
10201
10202 Example:
10203 ```py
10204 client = CommuneClient()
10205 client.query(name='function_name', params=['param1', 'param2'])
10206 ```
10207
10208 Raises:
10209 AssertionError: If the maximum connections value is less than or equal
10210 to zero.
10211 """
10212
10213 wait_for_finalization: bool
10214 _num_connections: int
10215 _connection_queue: queue.Queue[SubstrateInterface]
10216
10217 def __init__(
10218 self,
10219 url: str,
10220 num_connections: int = 1,
10221 wait_for_finalization: bool = False,
10222 ):
10223 """
10224 Args:
10225 url: The URL of the network node to connect to.
10226 num_connections: The number of websocket connections to be opened.
10227 """
10228 assert num_connections > 0
10229 self._num_connections = num_connections
10230 self.wait_for_finalization = wait_for_finalization
10231 self._connection_queue = queue.Queue(num_connections)
10232
10233 for _ in range(num_connections):
10234 self._connection_queue.put(SubstrateInterface(url))
10235
10236 @property
10237 def connections(self) -> int:
10238 """
10239 Gets the maximum allowed number of simultaneous connections to the
10240 network node.
10241 """
10242 return self._num_connections
10243
10244 @contextmanager
10245 def get_conn(self, timeout: float | None = None, init: bool = False):
10246 """
10247 Context manager to get a connection from the pool.
10248
10249 Tries to get a connection from the pool queue. If the queue is empty,
10250 it blocks for `timeout` seconds until a connection is available. If
10251 `timeout` is None, it blocks indefinitely.
10252
10253 Args:
10254 timeout: The maximum time in seconds to wait for a connection.
10255
10256 Yields:
10257 The connection object from the pool.
10258
10259 Raises:
10260            queue.Empty: If no connection is available within the timeout
10261                period.
10262 """
10263 conn = self._connection_queue.get(timeout=timeout)
10264 if init:
10265 conn.init_runtime() # type: ignore
10266 try:
10267 yield conn
10268 finally:
10269 self._connection_queue.put(conn)
10270
10271 def _get_storage_keys(
10272 self,
10273 storage: str,
10274 queries: list[tuple[str, list[Any]]],
10275 block_hash: str | None,
10276 ):
10277
10278 send: list[tuple[str, list[Any]]] = []
10279 prefix_list: list[Any] = []
10280
10282 with self.get_conn(init=True) as substrate:
10283 for function, params in queries:
10284 storage_key = StorageKey.create_from_storage_function( # type: ignore
10285 storage, function, params, runtime_config=substrate.runtime_config, metadata=substrate.metadata # type: ignore
10286 )
10287
10288 prefix = storage_key.to_hex()
10289 prefix_list.append(prefix)
10290 send.append(("state_getKeys", [prefix, block_hash]))
10292 return send, prefix_list
10293
10294 def _get_lists(
10295 self,
10296 storage_module: str,
10297 queries: list[tuple[str, list[Any]]],
10298 substrate: SubstrateInterface,
10299 ) -> list[tuple[Any, Any, Any, Any, str]]:
10300 """
10301 Generates a list of tuples containing parameters for each storage function based on the given functions and substrate interface.
10302
10303 Args:
10304            storage_module: The name of the storage (pallet) module to query.
10305            queries: A list of (storage function name, parameters) tuples.
10306 substrate: An instance of the SubstrateInterface class used to interact with the substrate.
10307
10308 Returns:
10309 A list of tuples in the format `(value_type, param_types, key_hashers, params, storage_function)` for each storage function in the given functions.
10310
10311 Example:
10312 >>> _get_lists(
10313                'storage_module', [('storage_function', ['param1', 'param2'])],
10314 substrate=substrate_instance
10315 )
10316 [('value_type', 'param_types', 'key_hashers', ['param1', 'param2'], 'storage_function'), ...]
10317 """
10318
10319 function_parameters: list[tuple[Any, Any, Any, Any, str]] = []
10320 metadata_pallet = substrate.metadata.get_metadata_pallet( # type: ignore
10321 storage_module) # type: ignore
10322 for storage_function, params in queries:
10323 storage_item = metadata_pallet.get_storage_function( # type: ignore
10324 storage_function) # type: ignore
10325 value_type = storage_item.get_value_type_string() # type: ignore
10326 param_types = storage_item.get_params_type_string() # type: ignore
10327 key_hashers = storage_item.get_param_hashers() # type: ignore
10328 function_parameters.append(
10329 (value_type, param_types, key_hashers,
10330 params, storage_function) # type: ignore
10331 )
10332 return function_parameters
10333
10334 def _send_batch(
10335 self,
10336 batch_payload: list[Any],
10337 request_ids: list[int],
10338 extract_result: bool = True,
10339 ):
10340 """
10341 Sends a batch of requests to the substrate and collects the results.
10342
10343        Args:
10344            batch_payload: The payload of the batch request.
10345            request_ids: A list of request IDs for tracking responses.
10346            extract_result: Whether to extract the `result` field from each
10347                response message, or to keep the raw message.
10348
10349        Raises:
10350            NetworkQueryError: If there is an `error` in the response message.
10351
10352        Returns:
10353            A list with one entry (extracted result or raw message) per
10354            request ID.
10355 """
10356 results: list[str | dict[Any, Any]] = []
10357 with self.get_conn(init=True) as substrate:
10358 try:
10359 substrate.websocket.send( # type: ignore
10360 json.dumps(batch_payload)) # type: ignore
10361 except NetworkQueryError:
10362 pass
10363 while len(results) < len(request_ids):
10364 received_messages = json.loads(
10365 substrate.websocket.recv()) # type: ignore
10366 if isinstance(received_messages, dict):
10367 received_messages: list[dict[Any, Any]] = [
10368 received_messages]
10369
10370 for message in received_messages:
10371 if message.get("id") in request_ids:
10372 if extract_result:
10373                            try:
10374                                results.append(message["result"])
10375                            except Exception as e:
10376                                # A response without a "result" field is
10377                                # malformed; surface it with context.
10378                                raise RuntimeError(
10379                                    f"Error extracting result from message: {message}"
10380                                ) from e
10381 else:
10382 results.append(message)
10383 if "error" in message:
10384 raise NetworkQueryError(message["error"])
10385
10386 return results
10387
10388 def _make_request_smaller(
10389 self,
10390 batch_request: list[tuple[T1, T2]],
10391 prefix_list: list[list[str]],
10392 fun_params: list[tuple[Any, Any, Any, Any, str]],
10393 ) -> tuple[list[list[tuple[T1, T2]]], list[Chunk]]:
10394 """
10395 Splits a batch of requests into smaller batches, each not exceeding the specified maximum size.
10396
10397 Args:
10398 batch_request: A list of requests to be sent in a batch.
10399            prefix_list, fun_params: Prefixes and decode parameters aligned with each request; batches are capped at MAX_REQUEST_SIZE bytes.
10400
10401 Returns:
10402            A tuple: (list of smaller request batches, matching list of Chunk objects).
10403
10404 Example:
10405            >>> _make_request_smaller(batch_request, prefix_list, fun_params)
10406            ([[('method1', 'params1')], [('method2', 'params2')]], [Chunk(...), Chunk(...)])
10407 """
10408 assert len(prefix_list) == len(fun_params) == len(batch_request)
10409
10410 def estimate_size(request: tuple[T1, T2]):
10411 """Convert the batch request to a string and measure its length"""
10412 return len(json.dumps(request))
10413
10414 # Initialize variables
10415 result: list[list[tuple[T1, T2]]] = []
10416 current_batch = []
10417 current_prefix_batch = []
10418 current_params_batch = []
10419 current_size = 0
10420
10421 chunk_list: list[Chunk] = []
10422
10423 # Iterate through each request in the batch
10424 for request, prefix, params in zip(batch_request, prefix_list, fun_params):
10425 request_size = estimate_size(request)
10426
10427 # Check if adding this request exceeds the max size
10428 if current_size + request_size > MAX_REQUEST_SIZE:
10429 # If so, start a new batch
10430
10431                    # Essentially checks that this is not the first iteration
10432 if current_batch:
10433 chunk = Chunk(
10434 current_batch, current_prefix_batch, current_params_batch
10435 )
10436 chunk_list.append(chunk)
10437 result.append(current_batch)
10438
10439 current_batch = [request]
10440 current_prefix_batch = [prefix]
10441 current_params_batch = [params]
10442 current_size = request_size
10443 else:
10444 # Otherwise, add to the current batch
10445 current_batch.append(request)
10446 current_size += request_size
10447 current_prefix_batch.append(prefix)
10448 current_params_batch.append(params)
10449
10450 # Add the last batch if it's not empty
10451 if current_batch:
10452 result.append(current_batch)
10453 chunk = Chunk(current_batch, current_prefix_batch,
10454 current_params_batch)
10455 chunk_list.append(chunk)
10456
10457 return result, chunk_list
10458
10459        def _are_changes_equal(self, change_a: Any, change_b: Any):
10460            # Returns True only when every zipped (key, value) pair matches.
10461            return all(a == c and b == d
10462                       for (a, b), (c, d) in zip(change_a, change_b))
10463
10464 def _rpc_request_batch(
10465 self, batch_requests: list[tuple[str, list[Any]]], extract_result: bool = True
10466 ) -> list[str]:
10467 """
10468 Sends batch requests to the substrate node using multiple threads and collects the results.
10469
10470        Args:
10471            batch_requests: A list of (method, params) requests to be sent
10472                in batches.
10473            extract_result: Whether to extract the result from the response
10474                message.
10475
10476 Returns:
10477 A list of results from the batch requests.
10478
10479 Example:
10480            >>> _rpc_request_batch([('method1', ['param1']), ('method2', ['param2'])])
10481 ['result1', 'result2', ...]
10482 """
10483
10484 chunk_results: list[Any] = []
10485 # smaller_requests = self._make_request_smaller(batch_requests)
10486 request_id = 0
10487 with ThreadPoolExecutor() as executor:
10488 futures: list[Future[list[str | dict[Any, Any]]]] = []
10489 for chunk in [batch_requests]:
10490 request_ids: list[int] = []
10491 batch_payload: list[Any] = []
10492 for method, params in chunk:
10493 request_id += 1
10494 request_ids.append(request_id)
10495 batch_payload.append(
10496 {
10497 "jsonrpc": "2.0",
10498 "method": method,
10499 "params": params,
10500 "id": request_id,
10501 }
10502 )
10503
10504 futures.append(
10505 executor.submit(
10506 self._send_batch,
10507 batch_payload=batch_payload,
10508 request_ids=request_ids,
10509 extract_result=extract_result,
10510 )
10511 )
10512 for future in futures:
10513                result = future.result()
10514                chunk_results.append(result)
10515 return chunk_results
10516
10517 def _rpc_request_batch_chunked(
10518 self, chunk_requests: list[Chunk], extract_result: bool = True
10519 ):
10520 """
10521 Sends batch requests to the substrate node using multiple threads and collects the results.
10522
10523 Args:
10524            chunk_requests: A list of Chunk objects whose batch requests are
10525                sent in (possibly re-split) batches.
10526            extract_result: Whether to extract the result from the response
10527                message.
10528
10529 Returns:
10530 A list of results from the batch requests.
10531
10532 Example:
10533            >>> _rpc_request_batch_chunked([chunk1, chunk2])
10534            (['result1', 'result2', ...], [chunk1, chunk2, ...])
10535 """
10536
10537        def split_chunks(chunk: Chunk, chunk_info: list[Chunk], chunk_info_idx: int):
10538            split_queries: list[tuple[Any, Any]] = []
10539            mutated_info = deepcopy(chunk_info)
10540            max_n_keys = 35000
10541            for query in chunk.batch_requests:
10542                result_keys = query[1][0]
10543                keys_amount = len(result_keys)
10544                if keys_amount > max_n_keys:  # replace the oversized chunk with capped ones
10545                    mutated_info.pop(chunk_info_idx)
10546                    for i in range(0, keys_amount, max_n_keys):
10547                        new_chunk = deepcopy(chunk)
10548                        split_keys = result_keys[i: i + max_n_keys]
10549                        split_query = deepcopy(query)
10550                        split_query[1][0] = split_keys
10551                        new_chunk.batch_requests = [split_query]
10552                        split_queries.append(split_query)
10553                        mutated_info.insert(chunk_info_idx, new_chunk)
10554                else:
10555                    split_queries.append(query)
10556            return split_queries, mutated_info
10557
10558 assert len(chunk_requests) > 0
10559 mutated_chunk_info: list[Chunk] = []
10560 chunk_results: list[Any] = []
10561 # smaller_requests = self._make_request_smaller(batch_requests)
10562 request_id = 0
10563
10564 with ThreadPoolExecutor() as executor:
10565 futures: list[Future[list[str | dict[Any, Any]]]] = []
10566 for idx, macro_chunk in enumerate(chunk_requests):
10567 _, mutated_chunk_info = split_chunks(
10568 macro_chunk, chunk_requests, idx)
10569 for chunk in mutated_chunk_info:
10570 request_ids: list[int] = []
10571 batch_payload: list[Any] = []
10572 for method, params in chunk.batch_requests:
10573 # for method, params in micro_chunk:
10574 request_id += 1
10575 request_ids.append(request_id)
10576 batch_payload.append(
10577 {
10578 "jsonrpc": "2.0",
10579 "method": method,
10580 "params": params,
10581 "id": request_id,
10582 }
10583 )
10584 futures.append(
10585 executor.submit(
10586 self._send_batch,
10587 batch_payload=batch_payload,
10588 request_ids=request_ids,
10589 extract_result=extract_result,
10590 )
10591 )
10592 for future in futures:
10593                    result = future.result()
10594                    chunk_results.append(result)
10595 return chunk_results, mutated_chunk_info
10596
10597 def _decode_response(
10598 self,
10599 response: list[str],
10600 function_parameters: list[tuple[Any, Any, Any, Any, str]],
10601 prefix_list: list[Any],
10602 block_hash: str,
10603 ) -> dict[str, dict[Any, Any]]:
10604 """
10605 Decodes a response from the substrate interface and organizes the data into a dictionary.
10606
10607 Args:
10608 response: A list of encoded responses from a substrate query.
10609            function_parameters: A list of tuples containing the decode
10610                parameters for each storage function, as produced by
10611                _get_lists().
10612            prefix_list: A list of prefixes used in the substrate query.
10613 block_hash: The hash of the block to be queried.
10614
10615 Returns:
10616 A dictionary where each key is a storage function name and the value is another dictionary.
10617 This inner dictionary's key is the decoded key from the response and the value is the corresponding decoded value.
10618
10619 Raises:
10620 ValueError: If an unsupported hash type is encountered in the `concat_hash_len` function.
10621
10622 Example:
10623 >>> _decode_response(
10624 response=[...],
10625 function_parameters=[...],
10627                    prefix_list=[...],
10629 block_hash="0x123..."
10630 )
10631 {'storage_function_name': {decoded_key: decoded_value, ...}, ...}
10632 """
10633
10634 def concat_hash_len(key_hasher: str) -> int:
10635 """
10636 Determines the length of the hash based on the given key hasher type.
10637
10638 Args:
10639 key_hasher: The type of key hasher.
10640
10641 Returns:
10642 The length of the hash corresponding to the given key hasher type.
10643
10644 Raises:
10645 ValueError: If the key hasher type is not supported.
10646
10647 Example:
10648 >>> concat_hash_len("Blake2_128Concat")
10649 16
10650 """
10651
10652 if key_hasher == "Blake2_128Concat":
10653 return 16
10654 elif key_hasher == "Twox64Concat":
10655 return 8
10656 elif key_hasher == "Identity":
10657 return 0
10658 else:
10659 raise ValueError("Unsupported hash type")
10660
10661 assert len(response) == len(function_parameters) == len(prefix_list)
10662 result_dict: dict[str, dict[Any, Any]] = {}
10663 for res, fun_params_tuple, prefix in zip(
10664 response, function_parameters, prefix_list
10665 ):
10666 if not res:
10667 continue
10668 res = res[0]
10669 changes = res["changes"] # type: ignore
10670 value_type, param_types, key_hashers, params, storage_function = (
10671 fun_params_tuple
10672 )
10673 with self.get_conn(init=True) as substrate:
10674 for item in changes:
10675 # Determine type string
10676 key_type_string: list[Any] = []
10677 for n in range(len(params), len(param_types)):
10678 key_type_string.append(
10679 f"[u8; {concat_hash_len(key_hashers[n])}]"
10680 )
10681 key_type_string.append(param_types[n])
10682
10683 item_key_obj = substrate.decode_scale( # type: ignore
10684 type_string=f"({', '.join(key_type_string)})",
10685 scale_bytes="0x" + item[0][len(prefix):],
10686 return_scale_obj=True,
10687 block_hash=block_hash,
10688 )
10689 # strip key_hashers to use as item key
10690 if len(param_types) - len(params) == 1:
10691 item_key = item_key_obj.value_object[1] # type: ignore
10692 else:
10693 item_key = tuple( # type: ignore
10694 item_key_obj.value_object[key + 1] # type: ignore
10695 for key in range( # type: ignore
10696 len(params), len(param_types) + 1, 2
10697 )
10698 )
10699
10700 item_value = substrate.decode_scale( # type: ignore
10701 type_string=value_type,
10702 scale_bytes=item[1],
10703 return_scale_obj=True,
10704 block_hash=block_hash,
10705 )
10706 result_dict.setdefault(storage_function, {})
10707
10708 result_dict[storage_function][item_key.value] = item_value.value # type: ignore
10709
10710 return result_dict
10711
10712 def query_batch(
10713 self, functions: dict[str, list[tuple[str, list[Any]]]]
10714 ) -> dict[str, str]:
10715 """
10716 Executes batch queries on a substrate and returns results in a dictionary format.
10717
10718 Args:
10719 functions (dict[str, list[query_call]]): A dictionary mapping module names to lists of query calls (function name and parameters).
10721
10722 Returns:
10723 A dictionary where keys are storage function names and values are the query results.
10724
10725 Raises:
10726 Exception: If no result is found from the batch queries.
10727
10728 Example:
10729 >>> self.query_batch({'module_name': [('function_name', ['param1', 'param2'])]})
10730 {'function_name': 'query_result', ...}
10731 """
10732
10733 result = None
10734 with self.get_conn(init=True) as substrate:
10735 for module, queries in functions.items():
10736 storage_keys: list[Any] = []
10737 for fn, params in queries:
10738 storage_function = substrate.create_storage_key( # type: ignore
10739 pallet=module, storage_function=fn, params=params
10740 )
10741 storage_keys.append(storage_function)
10742
10743 block_hash = substrate.get_block_hash()
10744 responses: list[Any] = substrate.query_multi( # type: ignore
10745 storage_keys=storage_keys, block_hash=block_hash
10746 )
10747
10748 result: dict[str, str] | None = {}
10749
10750 for item in responses:
10751 fun = item[0]
10752 query = item[1]
10753 storage_fun = fun.storage_function
10754 result[storage_fun] = query.value
10755
10756 if result is None:
10757 raise Exception("No result")
10758
10759 return result
10760
10761 def query_batch_map(
10762 self,
10763 functions: dict[str, list[tuple[str, list[Any]]]],
10764 block_hash: str | None = None,
10765 ) -> dict[str, dict[Any, Any]]:
10766 """
10767 Queries multiple storage functions using a map batch approach and returns the combined result.
10768
10769 Args:
10770 functions (dict[str, list[query_call]]): A dictionary mapping module names to lists of query calls.
10771 block_hash: The hash of the block to query at. If None, the latest block hash is used.
10772
10773 Returns:
10774 The combined result of the map batch query.
10775
10776 Example:
10777 >>> self.query_batch_map({'module_name': [('function_name', ['param1', 'param2'])]})
10778 # Returns the combined result of the map batch query
10779 """
10780 multi_result: dict[str, dict[Any, Any]] = {}
10781
10782 def recursive_update(
10783 d: dict[str, dict[T1, T2] | dict[str, Any]],
10784 u: Mapping[str, dict[Any, Any] | str],
10785 ) -> dict[str, dict[T1, T2]]:
10786 for k, v in u.items():
10787 if isinstance(v, dict):
10788 d[k] = recursive_update(d.get(k, {}), v) # type: ignore
10789 else:
10790 d[k] = v # type: ignore
10791 return d # type: ignore
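
# A minimal sketch of what `recursive_update` does (hypothetical data):
# nested dictionaries are merged key-by-key rather than overwritten, so
# results decoded from successive chunks accumulate under the same
# storage name.
#
# >>> recursive_update({"StakeTo": {0: 1}}, {"StakeTo": {1: 2}})
# {'StakeTo': {0: 1, 1: 2}}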
10792
10793 def get_page():
10794 send, prefix_list = self._get_storage_keys(
10795 storage, queries, block_hash)
10796 with self.get_conn(init=True) as substrate:
10797 function_parameters = self._get_lists(
10798 storage, queries, substrate)
10799 responses = self._rpc_request_batch(send)
10800 # `send` only carries the storage-function key requests,
10801 # so the response should stay small regardless of the number of queries
10802 assert len(responses) == 1
10803 res = responses[0]
10804 built_payload: list[tuple[str, list[Any]]] = []
10805 for result_keys in res:
10806 built_payload.append(
10807 ("state_queryStorageAt", [result_keys, block_hash])
10808 )
10809 _, chunks_info = self._make_request_smaller(
10810 built_payload, prefix_list, function_parameters
10811 )
10812 chunks_response, chunks_info = self._rpc_request_batch_chunked(
10813 chunks_info)
10814 return chunks_response, chunks_info
10815
10816 if not block_hash:
10817 with self.get_conn(init=True) as substrate:
10818 block_hash = substrate.get_block_hash()
10819 for storage, queries in functions.items():
10820 chunks, chunks_info = get_page()
10821 # if this doesn't hold, something is wrong in the code
10822 # and we won't be able to decode the data properly
10823 assert len(chunks) == len(chunks_info)
10824 for chunk_info, response in zip(chunks_info, chunks):
10825 storage_result = self._decode_response(
10826 response, chunk_info.fun_params, chunk_info.prefix_list, block_hash
10827 )
10828 multi_result = recursive_update(multi_result, storage_result)
10829
10830 return multi_result
10831
10832 def query(
10833 self,
10834 name: str,
10835 params: list[Any] = [],
10836 module: str = "SubspaceModule",
10837 ) -> Any:
10838 """
10839 Queries a storage function on the network.
10840
10841 Sends a query to the network and retrieves data from a
10842 specified storage function.
10843
10844 Args:
10845 name: The name of the storage function to query.
10846 params: The parameters to pass to the storage function.
10847 module: The module where the storage function is located.
10848
10849 Returns:
10850 The result of the query from the network.
10851
10852 Raises:
10853 NetworkQueryError: If the query fails or is invalid.
10854 """
10855
10856 result = self.query_batch({module: [(name, params)]})
10857
10858 return result[name]
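
# A minimal usage sketch (hypothetical `client` instance and output):
# `query` wraps `query_batch` for a single storage function, so the two
# calls below are equivalent.
#
# >>> client.query("N", params=[0])
# 42
# >>> client.query_batch({"SubspaceModule": [("N", [0])]})["N"]
# 42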
10859
10860 def query_map(
10861 self,
10862 name: str,
10863 params: list[Any] = [],
10864 module: str = "SubspaceModule",
10865 extract_value: bool = True,
10866 ) -> dict[Any, Any]:
10867 """
10868 Queries a storage map from a network node.
10869
10870 Args:
10871 name: The name of the storage map to query.
10872 params: A list of parameters for the query.
10873 module: The module in which the storage map is located.
10874 extract_value: Whether to extract the plain values from the query result.

10875 Returns:
10876 A dictionary representing the key-value pairs
10877 retrieved from the storage map.
10878
10879 Raises:
10880 QueryError: If the query to the network fails or is invalid.
10881 """
10882
10883 result = self.query_batch_map({module: [(name, params)]})
10884
10885 if extract_value:
10886 return {k.value: v.value for k, v in result} # type: ignore
10887
10888 return result
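
# A minimal usage sketch (hypothetical `client` and output): fetching the
# UID -> SS58 address map of subnet 0. With `extract_value=False`, the raw
# mapping produced by `query_batch_map` is returned unchanged.
#
# >>> client.query_map("Keys", [0], extract_value=False)["Keys"]
# {0: '5F...', 1: '5G...'}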
10889
10890 def compose_call(
10891 self,
10892 fn: str,
10893 params: dict[str, Any],
10894 key: Keypair,
10895 module: str = "SubspaceModule",
10896 wait_for_inclusion: bool = True,
10897 wait_for_finalization: bool | None = None,
10898 sudo: bool = False,
10899 ) -> ExtrinsicReceipt:
10900 """
10901 Composes and submits a call to the network node.
10902
10903 Composes and signs a call with the provided keypair, and submits it to
10904 the network. The call can be a standard extrinsic or a sudo extrinsic if
10905 elevated permissions are required. The method can optionally wait for
10906 the call's inclusion in a block and/or its finalization.
10907
10908 Args:
10909 fn: The function name to call on the network.
10910 params: A dictionary of parameters for the call.
10911 key: The keypair for signing the extrinsic.
10912 module: The module containing the function.
10913 wait_for_inclusion: Wait for the call's inclusion in a block.
10914 wait_for_finalization: Wait for the transaction's finalization.
10915 sudo: Execute the call as a sudo (superuser) operation.
10916
10917 Returns:
10918 The receipt of the submitted extrinsic, if
10919 `wait_for_inclusion` is True. Otherwise, returns a string
10920 identifier of the extrinsic.
10921
10922 Raises:
10923 ChainTransactionError: If the transaction fails.
10924 """
10925
10926 with self.get_conn() as substrate:
10927 if wait_for_finalization is None:
10928 wait_for_finalization = self.wait_for_finalization
10929
10930 call = substrate.compose_call( # type: ignore
10931 call_module=module, call_function=fn, call_params=params
10932 )
10933 if sudo:
10934 call = substrate.compose_call( # type: ignore
10935 call_module="Sudo",
10936 call_function="sudo",
10937 call_params={
10938 "call": call.value, # type: ignore
10939 },
10940 )
10941
10942 extrinsic = substrate.create_signed_extrinsic( # type: ignore
10943 call=call, keypair=key # type: ignore
10944 ) # type: ignore
10945 response = substrate.submit_extrinsic(
10946 extrinsic=extrinsic,
10947 wait_for_inclusion=wait_for_inclusion,
10948 wait_for_finalization=wait_for_finalization,
10949 )
10950 if wait_for_inclusion:
10951 if not response.is_success:
10952 raise ChainTransactionError(
10953 response.error_message, response # type: ignore
10954 )
10955
10956 return response
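
# A minimal usage sketch (hypothetical keypair and address): composing and
# submitting a plain balance transfer through this method.
#
# >>> receipt = client.compose_call(
# ...     fn="transfer",
# ...     params={"dest": "5F...", "value": 10_000_000_000},
# ...     key=my_keypair,
# ...     module="Balances",
# ... )
# >>> receipt.is_success
# True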
10957
10958 def compose_call_multisig(
10959 self,
10960 fn: str,
10961 params: dict[str, Any],
10962 key: Keypair,
10963 signatories: list[Ss58Address],
10964 threshold: int,
10965 module: str = "SubspaceModule",
10966 wait_for_inclusion: bool = True,
10967 wait_for_finalization: bool | None = None,
10968 sudo: bool = False,
10969 era: dict[str, int] | None = None,
10970 ) -> ExtrinsicReceipt:
10971 """
10972 Composes and submits a multisignature call to the network node.
10973
10974 This method allows the composition and submission of a call that
10975 requires multiple signatures for execution, known as a multisignature
10976 call. It supports specifying signatories, a threshold of signatures for
10977 the call's execution, and an optional era for the call's mortality. The
10978 call can be a standard extrinsic, a sudo extrinsic for elevated
10979 permissions, or a multisig extrinsic if multiple signatures are
10980 required. Optionally, the method can wait for the call's inclusion in a
10981 block and/or its finalization. Make sure to pass all keys
10982 that are part of the multisignature.
10983
10984 Args:
10985 fn: The function name to call on the network.
10986 params: A dictionary of parameters for the call.
10987 key: The keypair for signing the extrinsic.
10988 signatories: List of SS58 addresses of the signatories. Include ALL KEYS that are part of the multisig.
10989 threshold: The minimum number of signatories required to execute the extrinsic.
10990 module: The module containing the function to call.
10991 wait_for_inclusion: Whether to wait for the call's inclusion in a block.
10992 wait_for_finalization: Whether to wait for the transaction's finalization.
10993 sudo: Execute the call as a sudo (superuser) operation.
10994 era: Specifies the call's mortality in terms of blocks, in the format {'period': amount_blocks}. If omitted, the extrinsic is immortal.
10998
10999 Returns:
11000 The receipt of the submitted extrinsic if `wait_for_inclusion` is
11001 True. Otherwise, returns a string identifier of the extrinsic.
11002
11003 Raises:
11004 ChainTransactionError: If the transaction fails.
11005 """
11006
11007 # getting the call ready
11008 with self.get_conn() as substrate:
11009 if wait_for_finalization is None:
11010 wait_for_finalization = self.wait_for_finalization
11011
11012 # prepares the `GenericCall` object
11013 call = substrate.compose_call( # type: ignore
11014 call_module=module, call_function=fn, call_params=params
11015 )
11016 if sudo:
11017 call = substrate.compose_call( # type: ignore
11018 call_module="Sudo",
11019 call_function="sudo",
11020 call_params={
11021 "call": call.value, # type: ignore
11022 },
11023 )
11024
11025 # modify the RPC methods at runtime to allow for correct payment
11026 # fee calculation; Parity has a bug in this version,
11027 # where the method has to be removed
11028 rpc_methods = substrate.config.get("rpc_methods") # type: ignore
11029
11030 if "state_call" in rpc_methods: # type: ignore
11031 rpc_methods.remove("state_call") # type: ignore
11032
11033 # create the multisig account
11034 multisig_acc = substrate.generate_multisig_account( # type: ignore
11035 signatories, threshold
11036 )
11037
11038 # send the multisig extrinsic
11039 extrinsic = substrate.create_multisig_extrinsic( # type: ignore
11040 call=call,
11041 keypair=key,
11042 multisig_account=multisig_acc, # type: ignore
11043 era=era, # type: ignore
11044 ) # type: ignore
11045
11046 response = substrate.submit_extrinsic(
11047 extrinsic=extrinsic,
11048 wait_for_inclusion=wait_for_inclusion,
11049 wait_for_finalization=wait_for_finalization,
11050 )
11051
11052 if wait_for_inclusion:
11053 if not response.is_success:
11054 raise ChainTransactionError(
11055 response.error_message, response # type: ignore
11056 )
11057
11058 return response
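
# A minimal usage sketch (hypothetical keys and addresses): a 2-of-3
# multisig call that stays valid for roughly 100 blocks via `era`.
#
# >>> receipt = client.compose_call_multisig(
# ...     fn="transfer",
# ...     params={"dest": "5F...", "value": 1_000_000_000},
# ...     key=signer_keypair,
# ...     signatories=[addr_a, addr_b, addr_c],  # ALL multisig members
# ...     threshold=2,
# ...     module="Balances",
# ...     era={"period": 100},
# ... )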
11059
11060 def transfer(
11061 self,
11062 key: Keypair,
11063 amount: int,
11064 dest: Ss58Address,
11065 ) -> ExtrinsicReceipt:
11066 """
11067 Transfers a specified amount of tokens from the signer's account to the
11068 specified account.
11069
11070 Args:
11071 key: The keypair associated with the sender's account.
11072 amount: The amount to transfer, in nanotokens.
11073 dest: The SS58 address of the recipient.
11074
11075 Returns:
11076 A receipt of the transaction.
11077
11078 Raises:
11079 InsufficientBalanceError: If the sender's account does not have
11080 enough balance.
11081 ChainTransactionError: If the transaction fails.
11082 """
11083
11084 params = {"dest": dest, "value": amount}
11085
11086 return self.compose_call(
11087 module="Balances", fn="transfer", params=params, key=key
11088 )
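
# A minimal usage sketch (hypothetical values): amounts are denominated in
# nanotokens, so transferring 2.5 tokens means passing 2.5 * 10**9.
#
# >>> client.transfer(key=my_keypair, amount=2_500_000_000, dest="5F...")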
11089
11090 def transfer_multiple(
11091 self,
11092 key: Keypair,
11093 destinations: list[Ss58Address],
11094 amounts: list[int],
11095 netuid: str | int = 0,
11096 ) -> ExtrinsicReceipt:
11097 """
11098 Transfers specified amounts of tokens from the signer's account to
11099 multiple target accounts.
11100
11101 The `destinations` and `amounts` lists must be of the same length.
11102
11103 Args:
11104 key: The keypair associated with the sender's account.
11105 destinations: A list of SS58 addresses of the recipients.
11106 amounts: Amount to transfer to each recipient, in nanotokens.
11107 netuid: The network identifier.
11108
11109 Returns:
11110 A receipt of the transaction.
11111
11112 Raises:
11113 InsufficientBalanceError: If the sender's account does not have
11114 enough balance for all transfers.
11115 ChainTransactionError: If the transaction fails.
11116 """
11117
11118 assert len(destinations) == len(amounts)
11119
11120 # subtract the existential deposit from each amount
11121 existential_deposit = self.get_existential_deposit()
11122 amounts = [a - existential_deposit for a in amounts]
11123
11124 params = {
11125 "netuid": netuid,
11126 "destinations": destinations,
11127 "amounts": amounts,
11128 }
11129
11130 return self.compose_call(
11131 module="SubspaceModule", fn="transfer_multiple", params=params, key=key
11132 )
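
# A worked sketch of the existential-deposit adjustment above (hypothetical
# numbers): with an existential deposit of 500 nanotokens, requesting
# transfers of [1_000_000_000, 2_000_000_000] submits
# [999_999_500, 1_999_999_500] on-chain.
#
# >>> client.transfer_multiple(
# ...     key=my_keypair,
# ...     destinations=[addr_a, addr_b],
# ...     amounts=[1_000_000_000, 2_000_000_000],
# ... )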
11133
11134 def stake(
11135 self,
11136 key: Keypair,
11137 amount: int,
11138 dest: Ss58Address,
11139 netuid: int = 0,
11140 ) -> ExtrinsicReceipt:
11141 """
11142 Stakes the specified amount of tokens to a module key address.
11143
11144 Args:
11145 key: The keypair associated with the staker's account.
11146 amount: The amount of tokens to stake, in nanotokens.
11147 dest: The SS58 address of the module key to stake to.
11148 netuid: The network identifier.
11149
11150 Returns:
11151 A receipt of the staking transaction.
11152
11153 Raises:
11154 InsufficientBalanceError: If the staker's account does not have
11155 enough balance.
11156 ChainTransactionError: If the transaction fails.
11157 """
11158
11159 params = {"amount": amount, "netuid": netuid, "module_key": dest}
11160
11161 return self.compose_call(fn="add_stake", params=params, key=key)
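
# A minimal usage sketch (hypothetical values): staking 1 token
# (1_000_000_000 nanotokens) to a module key on subnet 0.
#
# >>> client.stake(key=my_keypair, amount=1_000_000_000, dest=module_key, netuid=0)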
11162
11163 def unstake(
11164 self,
11165 key: Keypair,
11166 amount: int,
11167 dest: Ss58Address,
11168 netuid: int = 0,
11169 ) -> ExtrinsicReceipt:
11170 """
11171 Unstakes the specified amount of tokens from a module key address.
11172
11173 Args:
11174 key: The keypair associated with the unstaker's account.
11175 amount: The amount of tokens to unstake, in nanotokens.
11176 dest: The SS58 address of the module key to unstake from.
11177 netuid: The network identifier.
11178
11179 Returns:
11180 A receipt of the unstaking transaction.
11181
11182 Raises:
11183 InsufficientStakeError: If the staked key does not have enough
11184 staked tokens by the signer key.
11185 ChainTransactionError: If the transaction fails.
11186 """
11187
11188 params = {"amount": amount, "netuid": netuid, "module_key": dest}
11189 return self.compose_call(fn="remove_stake", params=params, key=key)
11190
11191 def update_module(
11192 self,
11193 key: Keypair,
11194 name: str,
11195 address: str,
11196 metadata: str | None = None,
11197 delegation_fee: int = 20,
11198 netuid: int = 0,
11199 ) -> ExtrinsicReceipt:
11200 """
11201 Updates the parameters of a registered module.
11202
11203 The delegation fee must be an integer between 0 and 100.
11204
11205 Args:
11206 key: The keypair associated with the module's account.
11207 name: The new name for the module. If None, the name is not updated.
11208 address: The new address for the module. If None, the address is not updated.
11209 metadata: Optional metadata for the module. If None, the metadata is not updated.
11210 delegation_fee: The new delegation fee for the module, between 0 and 100.
11211 netuid: The network identifier.
11213
11214 Returns:
11215 A receipt of the module update transaction.
11216
11217 Raises:
11218 InvalidParameterError: If the provided parameters are invalid.
11219 ChainTransactionError: If the transaction fails.
11220 """
11221
11222 assert isinstance(delegation_fee, int)
11223
11224 params = {
11225 "netuid": netuid,
11226 "name": name,
11227 "address": address,
11228 "delegation_fee": delegation_fee,
11229 "metadata": metadata,
11230 }
11231
11232 response = self.compose_call("update_module", params=params, key=key)
11233
11234 return response
11235
11236 def register_module(
11237 self,
11238 key: Keypair,
11239 name: str,
11240 address: str | None = None,
11241 subnet: str = "commune",
11242 min_stake: int | None = None,
11243 metadata: str | None = None,
11244 ) -> ExtrinsicReceipt:
11245 """
11246 Registers a new module in the network.
11247
11248 Args:
11249 key: The keypair used for registering the module.
11250 name: The name of the module. If None, a default or
11251 previously set name is used.
11252 address: The address of the module. If None, a default or
11253 previously set address is used.
11254 subnet: The network subnet to register the module in.
11255 min_stake: The minimum stake required for the module, in nanotokens. If None, a default value is used.
11256 metadata: Optional metadata string for the module.
11257
11258 Returns:
11259 A receipt of the registration transaction.
11260
11261 Raises:
11262 InvalidParameterError: If the provided parameters are invalid.
11263 ChainTransactionError: If the transaction fails.
11264 """
11265
11266 stake = self.get_min_stake() if min_stake is None else min_stake
11267
11268 key_addr = key.ss58_address
11269
11270 params = {
11271 "network": subnet,
11272 "address": address,
11273 "name": name,
11274 "stake": stake,
11275 "module_key": key_addr,
11276 "metadata": metadata,
11277 }
11278
11279 response = self.compose_call("register", params=params, key=key)
11280 return response
11281
11282 def vote(
11283 self,
11284 key: Keypair,
11285 uids: list[int],
11286 weights: list[int],
11287 netuid: int = 0,
11288 ) -> ExtrinsicReceipt:
11289 """
11290 Casts votes on a list of module UIDs with corresponding weights.
11291
11292 The length of the UIDs list and the weights list should be the same.
11293 Each weight corresponds to the UID at the same index.
11294
11295 Args:
11296 key: The keypair used for signing the vote transaction.
11297 uids: A list of module UIDs to vote on.
11298 weights: A list of weights corresponding to each UID.
11299 netuid: The network identifier.
11300
11301 Returns:
11302 A receipt of the voting transaction.
11303
11304 Raises:
11305 InvalidParameterError: If the lengths of UIDs and weights lists
11306 do not match.
11307 ChainTransactionError: If the transaction fails.
11308 """
11309
11310 assert len(uids) == len(weights)
11311
11312 params = {
11313 "uids": uids,
11314 "weights": weights,
11315 "netuid": netuid,
11316 }
11317
11318 response = self.compose_call("set_weights", params=params, key=key)
11319
11320 return response
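
# A minimal usage sketch (hypothetical values): weighting three module UIDs,
# where each weight applies to the UID at the same index.
#
# >>> client.vote(key=my_keypair, uids=[0, 3, 7], weights=[10, 30, 60], netuid=0)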
11321
11322 def update_subnet(
11323 self,
11324 key: Keypair,
11325 params: SubnetParams,
11326 netuid: int = 0,
11327 ) -> ExtrinsicReceipt:
11328 """
11329 Update a subnet's configuration.
11330
11331 It requires the founder key for authorization.
11332
11333 Args:
11334 key: The founder keypair of the subnet.
11335 params: The new parameters for the subnet.
11336 netuid: The network identifier.
11337
11338 Returns:
11339 A receipt of the subnet update transaction.
11340
11341 Raises:
11342 AuthorizationError: If the key is not authorized.
11343 ChainTransactionError: If the transaction fails.
11344 """
11345
11346 general_params = dict(params)
11347 general_params["netuid"] = netuid
11348
11349 response = self.compose_call(
11350 fn="update_subnet",
11351 params=general_params,
11352 key=key,
11353 )
11354
11355 return response
11356
11357 def transfer_stake(
11358 self,
11359 key: Keypair,
11360 amount: int,
11361 from_module_key: Ss58Address,
11362 dest_module_address: Ss58Address,
11363 netuid: int = 0,
11364 ) -> ExtrinsicReceipt:
11365 """
11366 Reallocates staked tokens from one staked module to another.
11367
11368 Args:
11369 key: The keypair associated with the account that is delegating the tokens.
11370 amount: The amount of staked tokens to transfer, in nanotokens.
11371 from_module_key: The SS58 address of the module you want to transfer from (currently delegated by the key).
11372 dest_module_address: The SS58 address of the destination (newly delegated key).
11373 netuid: The network identifier.
11374
11375 Returns:
11376 A receipt of the stake transfer transaction.
11377
11378 Raises:
11379 InsufficientStakeError: If the source module key does not have enough staked tokens.
11380 ChainTransactionError: If the transaction fails.
11382 """
11383
11384 amount = amount - self.get_existential_deposit()
11385
11386 params = {
11387 "amount": amount,
11388 "netuid": netuid,
11389 "module_key": from_module_key,
11390 "new_module_key": dest_module_address,
11391 }
11392
11393 response = self.compose_call("transfer_stake", key=key, params=params)
11394
11395 return response
11396
11397 def multiunstake(
11398 self,
11399 key: Keypair,
11400 keys: list[Ss58Address],
11401 amounts: list[int],
11402 netuid: int = 0,
11403 ) -> ExtrinsicReceipt:
11404 """
11405 Unstakes tokens from multiple module keys.
11406
11407 The lists `keys` and `amounts` must be of the same length. Each
11408 amount corresponds to the module key at the same index.
11409
11410 Args:
11411 key: The keypair associated with the unstaker's account.
11412 keys: A list of SS58 addresses of the module keys to unstake from.
11413 amounts: A list of amounts to unstake from each module key,
11414 in nanotokens.
11415 netuid: The network identifier.
11416
11417 Returns:
11418 A receipt of the multi-unstaking transaction.
11419
11420 Raises:
11421 MismatchedLengthError: If the lengths of the keys and amounts lists do not match.
11422 InsufficientStakeError: If any of the module keys do not have enough staked tokens.
11423 ChainTransactionError: If the transaction fails.
11425 """
11426
11427 assert len(keys) == len(amounts)
11428
11429 params = {"netuid": netuid, "module_keys": keys, "amounts": amounts}
11430
11431 response = self.compose_call(
11432 "remove_stake_multiple", params=params, key=key)
11433
11434 return response
11435
11436 def multistake(
11437 self,
11438 key: Keypair,
11439 keys: list[Ss58Address],
11440 amounts: list[int],
11441 netuid: int = 0,
11442 ) -> ExtrinsicReceipt:
11443 """
11444 Stakes tokens to multiple module keys.
11445
11446 The lengths of the `keys` and `amounts` lists must be the same. Each
11447 amount corresponds to the module key at the same index.
11448
11449 Args:
11450 key: The keypair associated with the staker's account.
11451 keys: A list of SS58 addresses of the module keys to stake to.
11452 amounts: A list of amounts to stake to each module key,
11453 in nanotokens.
11454 netuid: The network identifier.
11455
11456 Returns:
11457 A receipt of the multi-staking transaction.
11458
11459 Raises:
11460 MismatchedLengthError: If the lengths of keys and amounts lists
11461 do not match.
11462 ChainTransactionError: If the transaction fails.
11463 """
11464
11465 assert len(keys) == len(amounts)
11466
11467 params = {
11468 "module_keys": keys,
11469 "amounts": amounts,
11470 "netuid": netuid,
11471 }
11472
11473 response = self.compose_call(
11474 "add_stake_multiple", params=params, key=key)
11475
11476 return response
11477
11478 def add_profit_shares(
11479 self,
11480 key: Keypair,
11481 keys: list[Ss58Address],
11482 shares: list[int],
11483 ) -> ExtrinsicReceipt:
11484 """
11485 Allocates profit shares to multiple keys.
11486
11487 The lists `keys` and `shares` must be of the same length,
11488 with each share amount corresponding to the key at the same index.
11489
11490 Args:
11491 key: The keypair associated with the account
11492 distributing the shares.
11493 keys: A list of SS58 addresses to allocate shares to.
11494 shares: A list of share amounts to allocate to each key,
11495 in nanotokens.
11496
11497 Returns:
11498 A receipt of the profit sharing transaction.
11499
11500 Raises:
11501 MismatchedLengthError: If the lengths of keys and shares
11502 lists do not match.
11503 ChainTransactionError: If the transaction fails.
11504 """
11505
11506 assert len(keys) == len(shares)
11507
11508 params = {"keys": keys, "shares": shares}
11509
11510 response = self.compose_call(
11511 "add_profit_shares", params=params, key=key)
11512
11513 return response
11514
11515 def add_subnet_proposal(
11516 self, key: Keypair, params: SubnetParams, netuid: int = 0
11517 ) -> ExtrinsicReceipt:
11518 """
11519 Submits a proposal for creating or modifying a subnet within the
11520 network.
11521
11522 The proposal includes various parameters like the name, founder, share
11523 allocations, and other subnet-specific settings.
11524
11525 Args:
11526 key: The keypair used for signing the proposal transaction.
11527 params: The parameters for the subnet proposal.
11528 netuid: The network identifier.
11529
11530 Returns:
11531 A receipt of the subnet proposal transaction.
11532
11533 Raises:
11534 InvalidParameterError: If the provided subnet
11535 parameters are invalid.
11536 ChainTransactionError: If the transaction fails.
11537 """
11538
11539 general_params = dict(params)
11540 general_params["netuid"] = netuid
11541
11542 response = self.compose_call(
11543 fn="add_subnet_proposal",
11544 params=general_params,
11545 key=key,
11546 )
11547
11548 return response
11549
11550 def add_custom_proposal(
11551 self,
11552 key: Keypair,
11553 cid: str,
11554 ) -> ExtrinsicReceipt:
11555 """Submits a custom (general) proposal whose content is referenced by a CID."""
11556 params = {"data": cid}
11557
11558 response = self.compose_call(
11559 fn="add_custom_proposal", params=params, key=key)
11560 return response
11561
11562 def add_custom_subnet_proposal(
11563 self,
11564 key: Keypair,
11565 cid: str,
11566 netuid: int = 0,
11567 ) -> ExtrinsicReceipt:
11568 """
11569 Submits a custom proposal scoped to a specific subnet.
11570
11571 The proposal's content is referenced by a CID, and the proposal
11572 is associated with the subnet identified by `netuid`.
11573
11574 Args:
11575 key: The keypair used for signing the proposal transaction.
11576 cid: The content identifier of the proposal data.
11577 netuid: The network identifier.
11579
11580 Returns:
11581 A receipt of the subnet proposal transaction.
11582 """
11583
11584 params = {
11585 "data": cid,
11586 "netuid": netuid,
11587 }
11588
11589 response = self.compose_call(
11590 fn="add_custom_subnet_proposal",
11591 params=params,
11592 key=key,
11593 )
11594
11595 return response
11596
11597 def add_global_proposal(
11598 self,
11599 key: Keypair,
11600 params: NetworkParams,
11601 ) -> ExtrinsicReceipt:
11602 """
11603 Submits a proposal for altering the global network parameters.
11604
11605 Allows for the submission of a proposal to
11606 change various global parameters
11607 of the network, such as emission rates, rate limits, and voting
11608 thresholds. It is used to
11609 suggest changes that affect the entire network's operation.
11610
11611 Args:
11612 key: The keypair used for signing the proposal transaction.
11613 params: A dictionary containing global network parameters
11614 like maximum allowed subnets, modules,
11615 transaction rate limits, and others.
11616
11617 Returns:
11618 A receipt of the global proposal transaction.
11619
11620 Raises:
11621 InvalidParameterError: If the provided network
11622 parameters are invalid.
11623 ChainTransactionError: If the transaction fails.
11624 """
11625
11626 general_params = vars(params)
11627 response = self.compose_call(
11628 fn="add_global_proposal",
11629 params=general_params,
11630 key=key,
11631 )
11632
11633 return response
11634
11635 def vote_on_proposal(
11636 self,
11637 key: Keypair,
11638 proposal_id: int,
11639 agree: bool,
11640 ) -> ExtrinsicReceipt:
11641 """
11642 Casts a vote on a specified proposal within the network.
11643
11644 Args:
11645 key: The keypair used for signing the vote transaction.
11646 proposal_id: The unique identifier of the proposal to vote on.
11647 agree: Whether to vote in favor (True) or against (False) the proposal.

11648 Returns:
11649 A receipt of the voting transaction.
11650
11651 Raises:
11652 InvalidProposalIDError: If the provided proposal ID does not
11653 exist or is invalid.
11654 ChainTransactionError: If the transaction fails.
11655 """
11656
11657 params = {"proposal_id": proposal_id, "agree": agree}
11658
11659 response = self.compose_call("vote_proposal", key=key, params=params)
11660
11661 return response
11662
11663 def unvote_on_proposal(
11664 self,
11665 key: Keypair,
11666 proposal_id: int,
11667 ) -> ExtrinsicReceipt:
11668 """
11669 Retracts a previously cast vote on a specified proposal.
11670
11671 Args:
11672 key: The keypair used for signing the unvote transaction.
11673 proposal_id: The unique identifier of the proposal to withdraw the
11674 vote from.
11675
11676 Returns:
11677 A receipt of the unvoting transaction.
11678
11679 Raises:
11680 InvalidProposalIDError: If the provided proposal ID does not
11681 exist or is invalid.
11682 ChainTransactionError: If the transaction fails to be processed, or
11683 if there was no prior vote to retract.
11684 """
11685
11686 params = {"proposal_id": proposal_id}
11687
11688 response = self.compose_call("unvote_proposal", key=key, params=params)
11689
11690 return response
11691
11692 def add_dao_application(
11693 self, key: Keypair, application_key: Ss58Address, data: str
11694 ) -> ExtrinsicReceipt:
11695 """
11696 Submits a new application to the general subnet DAO.
11697
11698 Args:
11699 key: The keypair used for signing the application transaction.
11700 application_key: The SS58 address of the application key.
11701 data: The data associated with the application.
11702
11703 Returns:
11704 A receipt of the application transaction.
11705
11706 Raises:
11707 ChainTransactionError: If the transaction fails.
11708 """
11709
11710 params = {"application_key": application_key, "data": data}
11711
11712 response = self.compose_call(
11713 "add_dao_application", key=key, params=params)
11714
11715 return response
11716
11717 def query_map_curator_applications(self) -> dict[str, dict[str, str]]:
"""Retrieves the mapping of curator (DAO) applications from the network."""
11718 query_result = self.query_map(
11719 "CuratorApplications", params=[], extract_value=False)
11720 applications = query_result.get("CuratorApplications", {})
11721 return applications
11722
11723 def query_map_proposals(
11724 self, extract_value: bool = False
11725 ) -> dict[int, dict[str, Any]]:
11726 """
11727 Retrieves a mapping of proposals from the network.
11728
11729 Queries the network and returns a mapping of proposal IDs to
11730 their respective parameters.
11731
11732 Returns:
11733 A dictionary mapping proposal IDs
11734 to dictionaries of their parameters.
11735
11736 Raises:
11737 QueryError: If the query to the network fails or is invalid.
11738 """
11739
11740 return self.query_map("Proposals", extract_value=extract_value)["Proposals"]
11741
11742 def query_map_weights(
11743 self, netuid: int = 0, extract_value: bool = False
11744 ) -> dict[int, list[int]]:
11745 """
11746 Retrieves a mapping of weights for keys on the network.
11747
11748 Queries the network and returns a mapping of key UIDs to
11749 their respective weights.
11750
11751 Args:
11752 netuid: The network UID from which to get the weights.
11753
11754 Returns:
11755 A dictionary mapping key UIDs to lists of their weights.
11756
11757 Raises:
11758 QueryError: If the query to the network fails or is invalid.
11759 """
11760
11761 return self.query_map("Weights", [netuid], extract_value=extract_value)[
11762 "Weights"
11763 ]
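
# A minimal usage sketch (hypothetical output): each module UID maps to the
# weight vector it has set on subnet 0.
#
# >>> client.query_map_weights(netuid=0)
# {0: [10, 30, 60], 1: [25, 25, 50]}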
11764
11765 def query_map_key(
11766 self,
11767 netuid: int = 0,
11768 extract_value: bool = False,
11769 ) -> dict[int, Ss58Address]:
11770 """
11771 Retrieves a map of keys from the network.
11772
11773 Fetches a mapping of key UIDs to their associated
11774 addresses on the network.
11775 The query can be targeted at a specific network UID if required.
11776
11777 Args:
11778 netuid: The network UID from which to get the keys.
11779
11780 Returns:
11781 A dictionary mapping key UIDs to their addresses.
11782
11783 Raises:
11784 QueryError: If the query to the network fails or is invalid.
11785 """
11786 return self.query_map("Keys", [netuid], extract_value=extract_value)["Keys"]
11787
11788 def query_map_address(
11789 self, netuid: int = 0, extract_value: bool = False
11790 ) -> dict[int, str]:
11791 """
11792 Retrieves a map of key addresses from the network.
11793
11794 Queries the network for a mapping of key UIDs to their addresses.
11795
11796 Args:
11797 netuid: The network UID from which to get the addresses.
11798
11799 Returns:
11800 A dictionary mapping key UIDs to their addresses.
11801
11802 Raises:
11803 QueryError: If the query to the network fails or is invalid.
11804 """
11805
11806 return self.query_map("Address", [netuid], extract_value=extract_value)[
11807 "Address"
11808 ]
11809
11810 def query_map_emission(self, extract_value: bool = False) -> dict[int, list[int]]:
11811 """
11812 Retrieves a map of emissions for keys on the network.
11813
11814 Queries the network to get a mapping of
11815 key UIDs to their emission values.
11816
11817 Returns:
11818 A dictionary mapping key UIDs to lists of their emission values.
11819
11820 Raises:
11821 QueryError: If the query to the network fails or is invalid.
11822 """
11823
11824 return self.query_map("Emission", extract_value=extract_value)["Emission"]
11825
11826 def query_map_incentive(self, extract_value: bool = False) -> dict[int, list[int]]:
11827 """
11828 Retrieves a mapping of incentives for keys on the network.
11829
11830 Queries the network and returns a mapping of key UIDs to
11831 their respective incentive values.
11832
11833 Returns:
11834 A dictionary mapping key UIDs to lists of their incentive values.
11835
11836 Raises:
11837 QueryError: If the query to the network fails or is invalid.
11838 """
11839
11840 return self.query_map("Incentive", extract_value=extract_value)["Incentive"]
11841
11842 def query_map_dividend(self, extract_value: bool = False) -> dict[int, list[int]]:
11843 """
11844 Retrieves a mapping of dividends for keys on the network.
11845
11846 Queries the network for a mapping of key UIDs to
11847 their dividend values.
11848
11849 Returns:
11850 A dictionary mapping key UIDs to lists of their dividend values.
11851
11852 Raises:
11853 QueryError: If the query to the network fails or is invalid.
11854 """
11855
11856 return self.query_map("Dividends", extract_value=extract_value)["Dividends"]
11857
11858 def query_map_regblock(
11859 self, netuid: int = 0, extract_value: bool = False
11860 ) -> dict[int, int]:
11861 """
11862 Retrieves a mapping of registration blocks for keys on the network.
11863
11864 Queries the network for a mapping of key UIDs to
11865 the blocks where they were registered.
11866
11867 Args:
11868 netuid: The network UID from which to get the registration blocks.
11869
11870 Returns:
11871 A dictionary mapping key UIDs to their registration blocks.
11872
11873 Raises:
11874 QueryError: If the query to the network fails or is invalid.
11875 """
11876
11877 return self.query_map(
11878 "RegistrationBlock", [netuid], extract_value=extract_value
11879 )["RegistrationBlock"]
11880
11881 def query_map_lastupdate(self, extract_value: bool = False) -> dict[int, list[int]]:
11882 """
11883 Retrieves a mapping of the last update times for keys on the network.
11884
11885 Queries the network for a mapping of key UIDs to their last update times.
11886
11887 Returns:
11888 A dictionary mapping key UIDs to lists of their last update times.
11889
11890 Raises:
11891 QueryError: If the query to the network fails or is invalid.
11892 """
11893
11894 return self.query_map("LastUpdate", extract_value=extract_value)["LastUpdate"]
11895
11896 def query_map_total_stake(self, extract_value: bool = False) -> dict[int, int]:
11897 """
11898 Retrieves a mapping of total stakes for keys on the network.
11899
11900 Queries the network for a mapping of key UIDs to their total stake amounts.
11901
11902 Returns:
11903 A dictionary mapping key UIDs to their total stake amounts.
11904
11905 Raises:
11906 QueryError: If the query to the network fails or is invalid.
11907 """
11908
11909 return self.query_map("TotalStake", extract_value=extract_value)["TotalStake"]
11910
11911 def query_map_stakefrom(
11912 self, netuid: int = 0, extract_value: bool = False
11913 ) -> dict[str, list[tuple[str, int]]]:
11914 """
11915 Retrieves a mapping of stakes from various sources for keys on the network.
11916
11917 Queries the network to obtain a mapping of key addresses to the sources
11918 and amounts of stakes they have received.
11919
11920 Args:
11921 netuid: The network UID from which to get the stakes.
11922
11923 Returns:
11924 A dictionary mapping key addresses to lists of tuples
11925 (module_key_address, amount).
11926
11927 Raises:
11928 QueryError: If the query to the network fails or is invalid.
11929 """
11930
11931 return self.query_map("StakeFrom", [netuid], extract_value=extract_value)[
11932 "StakeFrom"
11933 ]
11934
11935 def query_map_staketo(
11936 self, netuid: int = 0, extract_value: bool = False
11937 ) -> dict[str, list[tuple[str, int]]]:
11938 """
11939 Retrieves a mapping of stakes to destinations for keys on the network.
11940
11941 Queries the network for a mapping of key addresses to the destinations
11942 and amounts of stakes they have made.
11943
11944 Args:
11945 netuid: The network UID from which to get the stakes.
11946
11947 Returns:
11948 A dictionary mapping key addresses to lists of tuples
11949 (module_key_address, amount).
11950
11951 Raises:
11952 QueryError: If the query to the network fails or is invalid.
11953 """
11954
11955 return self.query_map("StakeTo", [netuid], extract_value=extract_value)[
11956 "StakeTo"
11957 ]
11958
11959 def query_map_stake(
11960 self, netuid: int = 0, extract_value: bool = False
11961 ) -> dict[str, int]:
11962 """
11963 Retrieves a mapping of stakes for keys on the network.
11964
11965 Queries the network and returns a mapping of key addresses to their
11966 respective delegated staked balances amounts.
11967 The query can be targeted at a specific network UID if required.
11968
11969 Args:
11970 netuid: The network UID from which to get the stakes.
11971
11972 Returns:
11973 A dictionary mapping key addresses to their stake amounts.
11974
11975 Raises:
11976 QueryError: If the query to the network fails or is invalid.
11977 """
11978
11979 return self.query_map("Stake", [netuid], extract_value=extract_value)["Stake"]
11980
11981 def query_map_delegationfee(
11982 self, netuid: int = 0, extract_value: bool = False
11983 ) -> dict[str, int]:
11984 """
11985 Retrieves a mapping of delegation fees for keys on the network.
11986
11987 Queries the network to obtain a mapping of key addresses to their
11988 respective delegation fees.
11989
11990 Args:
11991 netuid: The network UID to filter the delegation fees.
11992
11993 Returns:
11994 A dictionary mapping key addresses to their delegation fees.
11995
11996 Raises:
11997 QueryError: If the query to the network fails or is invalid.
11998 """
11999
12000 return self.query_map("DelegationFee", [netuid], extract_value=extract_value)[
12001 "DelegationFee"
12002 ]
12003
12004 def query_map_tempo(self, extract_value: bool = False) -> dict[int, int]:
12005 """
12006 Retrieves a mapping of tempo settings for the network.
12007
12008 Queries the network to obtain the tempo (rate of reward distributions)
12009 settings for various network subnets.
12010
12011 Returns:
12012 A dictionary mapping network UIDs to their tempo settings.
12013
12014 Raises:
12015 QueryError: If the query to the network fails or is invalid.
12016 """
12017
12018 return self.query_map("Tempo", extract_value=extract_value)["Tempo"]
12019
12020 def query_map_immunity_period(self, extract_value: bool = False) -> dict[int, int]:
12021 """
12022 Retrieves a mapping of immunity periods for the network.
12023
12024 Queries the network for the immunity period settings,
12025 which represent the time duration during which modules
12026 cannot be deregistered.
12027
12028 Returns:
12029 A dictionary mapping network UIDs to their immunity period settings.
12030
12031 Raises:
12032 QueryError: If the query to the network fails or is invalid.
12033 """
12034
12035 return self.query_map("ImmunityPeriod", extract_value=extract_value)[
12036 "ImmunityPeriod"
12037 ]
12038
12039 def query_map_min_allowed_weights(
12040 self, extract_value: bool = False
12041 ) -> dict[int, int]:
12042 """
12043 Retrieves a mapping of minimum allowed weights for the network.
12044
12045 Queries the network to obtain the minimum allowed weights,
12046 which are the lowest permissible weight values that can be set by
12047 validators.
12048
12049 Returns:
12050 A dictionary mapping network UIDs to
12051 their minimum allowed weight values.
12052
12053 Raises:
12054 QueryError: If the query to the network fails or is invalid.
12055 """
12056
12057 return self.query_map("MinAllowedWeights", extract_value=extract_value)[
12058 "MinAllowedWeights"
12059 ]
12060
12061 def query_map_max_allowed_weights(
12062 self, extract_value: bool = False
12063 ) -> dict[int, int]:
12064 """
12065 Retrieves a mapping of maximum allowed weights for the network.
12066
12067 Queries the network for the maximum allowed weights,
12068 which are the highest permissible
12069 weight values that can be set by validators.
12070
12071 Returns:
12072 A dictionary mapping network UIDs to
12073 their maximum allowed weight values.
12074
12075 Raises:
12076 QueryError: If the query to the network fails or is invalid.
12077 """
12078
12079 return self.query_map("MaxAllowedWeights", extract_value=extract_value)[
12080 "MaxAllowedWeights"
12081 ]
12082
12083 def query_map_max_allowed_uids(self, extract_value: bool = False) -> dict[int, int]:
12084 """
12085 Queries the network for the maximum number of allowed UIDs
12086 (module identifiers) for each network subnet.
12087
12088 Fetches a mapping of network subnets to their respective
12089 limits on the number of module UIDs that can be created or used.
12090
12091 Returns:
12092 A dictionary mapping network UIDs (unique identifiers) to their
12093 maximum allowed number of UIDs.
12094 Each entry represents a network subnet
12095 with its corresponding UID limit.
12096
12097 Raises:
12098 QueryError: If the query to the network fails or is invalid.
12099 """
12100
12101 return self.query_map("MaxAllowedUids", extract_value=extract_value)[
12102 "MaxAllowedUids"
12103 ]
12104
12105 def query_map_min_stake(self, extract_value: bool = False) -> dict[int, int]:
12106 """
12107 Retrieves a mapping of minimum allowed stake on the network.
12108
12109 Queries the network to obtain the minimum required stake,
12110 which is denominated in nanotokens.
12111
12112 Returns:
12113 A dictionary mapping network UIDs to
12114 their minimum allowed stake values.
12115
12116 Raises:
12117 QueryError: If the query to the network fails or is invalid.
12118 """
12119
12120 return self.query_map("MinStake", extract_value=extract_value)["MinStake"]
12121
12122 def query_map_max_stake(self, extract_value: bool = False) -> dict[int, int]:
12123 """
12124 Retrieves a mapping of the maximum stake values for the network.
12125
12126 Queries the network for the maximum stake values across various
12127 subnets of the network.
12128
12129 Returns:
12130 A dictionary mapping network UIDs to their maximum stake values.
12131
12132 Raises:
12133 QueryError: If the query to the network fails or is invalid.
12134 """
12135
12136 return self.query_map("MaxStake", extract_value=extract_value)["MaxStake"]
12137
12138 def query_map_founder(self, extract_value: bool = False) -> dict[int, str]:
12139 """
12140 Retrieves a mapping of founders for the network.
12141
12142 Queries the network to obtain the founders associated with
12143 various subnets.
12144
12145 Returns:
12146 A dictionary mapping network UIDs to their respective founders.
12147
12148 Raises:
12149 QueryError: If the query to the network fails or is invalid.
12150 """
12151
12152 return self.query_map("Founder", extract_value=extract_value)["Founder"]
12153
12154 def query_map_founder_share(self, extract_value: bool = False) -> dict[int, int]:
12155 """
12156 Retrieves a mapping of founder shares for the network.
12157
12158 Queries the network for the share percentages
12159 allocated to founders across different subnets.
12160
12161 Returns:
12162 A dictionary mapping network UIDs to their founder share percentages.
12163
12164 Raises:
12165 QueryError: If the query to the network fails or is invalid.
12166 """
12167
12168 return self.query_map("FounderShare", extract_value=extract_value)[
12169 "FounderShare"
12170 ]
12171
12172 def query_map_incentive_ratio(self, extract_value: bool = False) -> dict[int, int]:
12173 """
12174 Retrieves a mapping of incentive ratios for the network.
12175
12176 Queries the network for the incentive ratios,
12177 which are the proportions of rewards or incentives
12178 allocated in different subnets of the network.
12179
12180 Returns:
12181 A dictionary mapping network UIDs to their incentive ratios.
12182
12183 Raises:
12184 QueryError: If the query to the network fails or is invalid.
12185 """
12186
12187 return self.query_map("IncentiveRatio", extract_value=extract_value)[
12188 "IncentiveRatio"
12189 ]
12190
12191 def query_map_trust_ratio(self, extract_value: bool = False) -> dict[int, int]:
12192 """
12193 Retrieves a mapping of trust ratios for the network.
12194
12195 Queries the network for trust ratios,
12196 indicative of the level of trust or credibility assigned
12197 to different subnets of the network.
12198
12199 Returns:
12200 A dictionary mapping network UIDs to their trust ratios.
12201
12202 Raises:
12203 QueryError: If the query to the network fails or is invalid.
12204 """
12205
12206 return self.query_map("TrustRatio", extract_value=extract_value)["TrustRatio"]
12207
12208 def query_map_vote_mode_subnet(self, extract_value: bool = False) -> dict[int, str]:
12209 """
12210 Retrieves a mapping of vote modes for subnets within the network.
12211
12212 Queries the network for the voting modes used in different
12213 subnets, which define the methodology or approach of voting within those
12214 subnets.
12215
12216 Returns:
12217 A dictionary mapping network UIDs to their vote
12218 modes for subnets.
12219
12220 Raises:
12221 QueryError: If the query to the network fails or is invalid.
12222 """
12223
12224 return self.query_map("VoteModeSubnet", extract_value=extract_value)[
12225 "VoteModeSubnet"
12226 ]
12227
12228 def query_map_legit_whitelist(
12229 self, extract_value: bool = False
12230 ) -> dict[Ss58Address, int]:
12231 """
12232 Retrieves a mapping of whitelisted addresses for the network.
12233
12234 Queries the network for a mapping of whitelisted addresses
12235 and their respective legitimacy status.
12236
12237 Returns:
12238 A dictionary mapping addresses to their legitimacy status.
12239
12240 Raises:
12241 QueryError: If the query to the network fails or is invalid.
12242 """
12243
12244 return self.query_map("LegitWhitelist", extract_value=extract_value)[
12245 "LegitWhitelist"
12246 ]
12247
12248 def query_map_subnet_names(self, extract_value: bool = False) -> dict[int, str]:
12249 """
12250 Retrieves a mapping of subnet names within the network.
12251
12252 Queries the network for the names of various subnets,
12253 providing an overview of the different
12254 subnets within the network.
12255
12256 Returns:
12257 A dictionary mapping network UIDs to their subnet names.
12258
12259 Raises:
12260 QueryError: If the query to the network fails or is invalid.
12261 """
12262
12263 return self.query_map("SubnetNames", extract_value=extract_value)["SubnetNames"]
12264
12265 def query_map_balances(
12266 self, extract_value: bool = False
12267 ) -> dict[str, dict[str, int | dict[str, int]]]:
12268 """
12269 Retrieves a mapping of account balances within the network.
12270
12271 Queries the network for the balances associated with different accounts.
12272 It provides detailed information including various types of
12273 balances for each account.
12274
12275 Returns:
12276 A dictionary mapping account addresses to their balance details.
12277
12278 Raises:
12279 QueryError: If the query to the network fails or is invalid.
12280 """
12281
12282 return self.query_map("Account", module="System", extract_value=extract_value)[
12283 "Account"
12284 ]
12285
12286 def query_map_registration_blocks(
12287 self, netuid: int = 0, extract_value: bool = False
12288 ) -> dict[int, int]:
12289 """
12290 Retrieves a mapping of registration blocks for UIDs on the network.
12291
12292 Queries the network to find the block numbers at which various
12293 UIDs were registered.
12294
12295 Args:
12296 netuid: The network UID from which to get the registrations.
12297
12298 Returns:
12299 A dictionary mapping UIDs to their registration block numbers.
12300
12301 Raises:
12302 QueryError: If the query to the network fails or is invalid.
12303 """
12304
12305 return self.query_map(
12306 "RegistrationBlock", [netuid], extract_value=extract_value
12307 )["RegistrationBlock"]
12308
12309 def query_map_name(
12310 self, netuid: int = 0, extract_value: bool = False
12311 ) -> dict[int, str]:
12312 """
12313 Retrieves a mapping of names for keys on the network.
12314
12315 Queries the network for the names associated with different keys.
12316 It provides a mapping of key UIDs to their registered names.
12317
12318 Args:
12319 netuid: The network UID from which to get the names.
12320
12321 Returns:
12322 A dictionary mapping key UIDs to their names.
12323
12324 Raises:
12325 QueryError: If the query to the network fails or is invalid.
12326 """
12327
12328 return self.query_map("Name", [netuid], extract_value=extract_value)["Name"]
12329
12330 # == QUERY FUNCTIONS == #
12331
12332 def get_immunity_period(self, netuid: int = 0) -> int:
12333 """
12334 Queries the network for the immunity period setting.
12335
12336 The immunity period is a time duration during which a module
12337 cannot be deregistered from the network.
12338 Fetches the immunity period for a specified network subnet.
12339
12340 Args:
12341 netuid: The network UID for which to query the immunity period.
12342
12343 Returns:
12344 The immunity period setting for the specified network subnet.
12345
12346 Raises:
12347 QueryError: If the query to the network fails or is invalid.
12348 """
12349
12350 return self.query(
12351 "ImmunityPeriod",
12352 params=[netuid],
12353 )
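
# A minimal usage sketch (hypothetical output): reading the immunity period
# of subnet 0, measured in blocks.
#
# >>> client.get_immunity_period(0)
# 40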
12354
12355 def get_max_set_weights_per_epoch(self):
"""Queries the network for the maximum number of set-weights calls allowed per epoch."""
12356 return self.query("MaximumSetWeightCallsPerEpoch")
12357
12358 def get_min_allowed_weights(self, netuid: int = 0) -> int:
12359 """
12360 Queries the network for the minimum allowed weights setting.
12361
12362 Retrieves the minimum weight values that are possible to set
12363 by a validator within a specific network subnet.
12364
12365 Args:
12366 netuid: The network UID for which to query the minimum allowed
12367 weights.
12368
12369 Returns:
12370 The minimum allowed weight values for the specified network
12371 subnet.
12372
12373 Raises:
12374 QueryError: If the query to the network fails or is invalid.
12375 """
12376
12377 return self.query(
12378 "MinAllowedWeights",
12379 params=[netuid],
12380 )
12381
12382 def get_max_allowed_weights(self, netuid: int = 0) -> int:
12383 """
12384 Queries the network for the maximum allowed weights setting.
12385
12386 Retrieves the maximum weight values that are possible to set
12387 by a validator within a specific network subnet.
12388
12389 Args:
12390 netuid: The network UID for which to query the maximum allowed
12391 weights.
12392
12393 Returns:
12394 The maximum allowed weight values for the specified network
12395 subnet.
12396
12397 Raises:
12398 QueryError: If the query to the network fails or is invalid.
12399 """
12400
12401 return self.query("MaxAllowedWeights", params=[netuid])
12402
12403 def get_max_allowed_uids(self, netuid: int = 0) -> int:
12404 """
12405 Queries the network for the maximum allowed UIDs setting.
12406
12407 Fetches the upper limit on the number of module UIDs that can
12408 be allocated or used within a specific network subnet.
12409
12410 Args:
12411 netuid: The network UID for which to query the maximum allowed UIDs.
12412
12413 Returns:
12414 The maximum number of allowed UIDs for the specified network subnet.
12415
12416 Raises:
12417 QueryError: If the query to the network fails or is invalid.
12418 """
12419
12420 return self.query("MaxAllowedUids", params=[netuid])
12421
12422 def get_name(self, netuid: int = 0) -> str:
12423 """
12424 Queries the network for the name of a specific subnet.
12425
12426 Args:
12427 netuid: The network UID for which to query the name.
12428
12429 Returns:
12430 The name of the specified network subnet.
12431
12432 Raises:
12433 QueryError: If the query to the network fails or is invalid.
12434 """
12435
12436 return self.query("Name", params=[netuid])
12437
12438 def get_subnet_name(self, netuid: int = 0) -> str:
12439 """
12440 Queries the network for the name of a specific subnet.
12441
12442 Args:
12443 netuid: The network UID for which to query the name.
12444
12445 Returns:
12446 The name of the specified network subnet.
12447
12448 Raises:
12449 QueryError: If the query to the network fails or is invalid.
12450 """
12451
12452 return self.query("SubnetNames", params=[netuid])
12453
12454 def get_global_dao_treasury(self):
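        """
        Queries the network for the global DAO treasury balance. (Editor's
        docstring; summary inferred from the `GlobalDaoTreasury` storage
        name.)
        """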
12455 return self.query("GlobalDaoTreasury")
12456
12457 def get_n(self, netuid: int = 0) -> int:
12458 """
12459 Queries the network for the 'N' hyperparameter, which represents how
12460 many modules are on the network.
12461
12462 Args:
12463 netuid: The network UID for which to query the 'N' hyperparameter.
12464
12465 Returns:
12466 The value of the 'N' hyperparameter for the specified network
12467 subnet.
12468
12469 Raises:
12470 QueryError: If the query to the network fails or is invalid.
12471 """
12472
12473 return self.query("N", params=[netuid])
12474
12475 def get_tempo(self, netuid: int = 0) -> int:
12476 """
12477 Queries the network for the tempo setting, measured in blocks, for the
12478 specified subnet.
12479
12480 Args:
12481 netuid: The network UID for which to query the tempo.
12482
12483 Returns:
12484 The tempo setting for the specified subnet.
12485
12486 Raises:
12487 QueryError: If the query to the network fails or is invalid.
12488 """
12489
12490 return self.query("Tempo", params=[netuid])
12491
12492 def get_total_stake(self, netuid: int = 0):
12493 """
12494 Queries the network for the total stake amount.
12495
12496 Retrieves the total amount of stake within a specific network subnet.
12497
12498 Args:
12499 netuid: The network UID for which to query the total stake.
12500
12501 Returns:
12502 The total stake amount for the specified network subnet.
12503
12504 Raises:
12505 QueryError: If the query to the network fails or is invalid.
12506 """
12507
12508 return self.query(
12509 "TotalStake",
12510 params=[netuid],
12511 )
12512
12513 def get_registrations_per_block(self):
12514 """
12515 Queries the network for the number of registrations per block.
12516
12517 Fetches the number of registrations that are processed per
12518 block within the network.
12519
12520 Returns:
12521 The number of registrations processed per block.
12522
12523 Raises:
12524 QueryError: If the query to the network fails or is invalid.
12525 """
12526
12527 return self.query(
12528 "RegistrationsPerBlock",
12529 )
12530
12531 def max_registrations_per_block(self, netuid: int = 0):
12532 """
12533 Queries the network for the maximum number of registrations per block.
12534
12535 Retrieves the upper limit of registrations that can be processed in
12536 each block within a specific network subnet.
12537
12538 Args:
12539 netuid: The network UID for which to query.
12540
12541 Returns:
12542 The maximum number of registrations per block for
12543 the specified network subnet.
12544
12545 Raises:
12546 QueryError: If the query to the network fails or is invalid.
12547 """
12548
12549 return self.query(
12550 "MaxRegistrationsPerBlock",
12551 params=[netuid],
12552 )
12553
12554 def get_proposal(self, proposal_id: int = 0):
12555 """
12556 Queries the network for a specific proposal.
12557
12558 Args:
12559 proposal_id: The ID of the proposal to query.
12560
12561 Returns:
12562 The details of the specified proposal.
12563
12564 Raises:
12565 QueryError: If the query to the network fails, is invalid,
12566 or if the proposal ID does not exist.
12567 """
12568
12569 return self.query(
12570 "Proposals",
12571 params=[proposal_id],
12572 )
12573
12574 def get_trust(self, netuid: int = 0):
12575 """
12576 Queries the network for the trust setting of a specific network subnet.
12577
12578 Retrieves the trust level or score, which may represent the
12579 level of trustworthiness or reliability within a
12580 particular network subnet.
12581
12582 Args:
12583 netuid: The network UID for which to query the trust setting.
12584
12585 Returns:
12586 The trust level or score for the specified network subnet.
12587
12588 Raises:
12589 QueryError: If the query to the network fails or is invalid.
12590 """
12591
12592 return self.query(
12593 "Trust",
12594 params=[netuid],
12595 )
12596
12597    def get_uids(self, key: Ss58Address, netuid: int = 0) -> int | None:
12598        """
12599        Queries the network for the module UID associated with a specific key.
12600
12601 Args:
12602 key: The key address for which to query UIDs.
12603 netuid: The network UID within which to search for the key.
12604
12605 Returns:
12606            The UID associated with the specified key, or None if the key
                 is not registered on the subnet.
12607
12608 Raises:
12609 QueryError: If the query to the network fails or is invalid.
12610 """
12611
12612 return self.query(
12613 "Uids",
12614 params=[netuid, key],
12615 )
12616
12617 def get_unit_emission(self) -> int:
12618 """
12619 Queries the network for the unit emission setting.
12620
12621 Retrieves the unit emission value, which represents the
12622 emission rate or quantity for the $COMAI token.
12623
12624 Returns:
12625 The unit emission value in nanos for the network.
12626
12627 Raises:
12628 QueryError: If the query to the network fails or is invalid.
12629 """
12630
12631 return self.query("UnitEmission")
12632
12633 def get_tx_rate_limit(self) -> int:
12634 """
12635 Queries the network for the transaction rate limit.
12636
12637 Retrieves the rate limit for transactions within the network,
12638 which defines the maximum number of transactions that can be
12639 processed within a certain timeframe.
12640
12641 Returns:
12642 The transaction rate limit for the network.
12643
12644 Raises:
12645 QueryError: If the query to the network fails or is invalid.
12646 """
12647
12648 return self.query(
12649 "TxRateLimit",
12650 )
12651
12652 def get_burn_rate(self) -> int:
12653 """
12654 Queries the network for the burn rate setting.
12655
12656 Retrieves the burn rate, which represents the rate at
12657 which the $COMAI token is permanently
12658 removed or 'burned' from circulation.
12659
12660 Returns:
12661 The burn rate for the network.
12662
12663 Raises:
12664 QueryError: If the query to the network fails or is invalid.
12665 """
12666
12667 return self.query(
12668 "BurnRate",
12669 params=[],
12670 )
12671
12672 def get_burn(self, netuid: int = 0) -> int:
12673 """
12674 Queries the network for the burn setting.
12675
12676 Retrieves the burn value, which represents the amount of the
12677 $COMAI token that is 'burned' or permanently removed from
12678 circulation.
12679
12680 Args:
12681 netuid: The network UID for which to query the burn value.
12682
12683 Returns:
12684 The burn value for the specified network subnet.
12685
12686 Raises:
12687 QueryError: If the query to the network fails or is invalid.
12688 """
12689
12690 return self.query("Burn", params=[netuid])
12691
12692 def get_min_burn(self) -> int:
12693 """
12694 Queries the network for the minimum burn setting.
12695
12696 Retrieves the minimum burn value, indicating the lowest
12697 amount of the $COMAI tokens that can be 'burned' or
12698 permanently removed from circulation.
12699
12700 Returns:
12701 The minimum burn value for the network.
12702
12703 Raises:
12704 QueryError: If the query to the network fails or is invalid.
12705 """
12706
12707 return self.query(
12708 "MinBurn",
12709 params=[],
12710 )
12711
12712 def get_min_weight_stake(self) -> int:
12713 """
12714 Queries the network for the minimum weight stake setting.
12715
12716 Retrieves the minimum weight stake, which represents the lowest
12717 stake weight that is allowed for certain operations or
12718 transactions within the network.
12719
12720 Returns:
12721 The minimum weight stake for the network.
12722
12723 Raises:
12724 QueryError: If the query to the network fails or is invalid.
12725 """
12726
12727 return self.query("MinWeightStake", params=[])
12728
12729 def get_vote_mode_global(self) -> str:
12730 """
12731 Queries the network for the global vote mode setting.
12732
12733 Retrieves the global vote mode, which defines the overall voting
12734        methodology or approach used across the network by default.
12735
12736 Returns:
12737 The global vote mode setting for the network.
12738
12739 Raises:
12740 QueryError: If the query to the network fails or is invalid.
12741 """
12742
12743 return self.query(
12744 "VoteModeGlobal",
12745 )
12746
12747 def get_max_proposals(self) -> int:
12748 """
12749 Queries the network for the maximum number of proposals allowed.
12750
12751 Retrieves the upper limit on the number of proposals that can be
12752 active or considered at any given time within the network.
12753
12754 Returns:
12755 The maximum number of proposals allowed on the network.
12756
12757 Raises:
12758 QueryError: If the query to the network fails or is invalid.
12759 """
12760
12761 return self.query(
12762 "MaxProposals",
12763 )
12764
12765 def get_max_registrations_per_block(self) -> int:
12766 """
12767 Queries the network for the maximum number of registrations per block.
12768
12769 Retrieves the maximum number of registrations that can
12770 be processed in each block within the network.
12771
12772 Returns:
12773 The maximum number of registrations per block on the network.
12774
12775 Raises:
12776 QueryError: If the query to the network fails or is invalid.
12777 """
12778
12779 return self.query(
12780 "MaxRegistrationsPerBlock",
12781 params=[],
12782 )
12783
12784 def get_max_name_length(self) -> int:
12785 """
12786 Queries the network for the maximum length allowed for names.
12787
12788 Retrieves the maximum character length permitted for names
12789        within the network, such as module names.
12790
12791 Returns:
12792 The maximum length allowed for names on the network.
12793
12794 Raises:
12795 QueryError: If the query to the network fails or is invalid.
12796 """
12797
12798 return self.query(
12799 "MaxNameLength",
12800 params=[],
12801 )
12802
12803 def get_global_vote_threshold(self) -> int:
12804 """
12805 Queries the network for the global vote threshold.
12806
12807 Retrieves the global vote threshold, which is the critical value or
12808 percentage required for decisions in the network's governance process.
12809
12810 Returns:
12811 The global vote threshold for the network.
12812
12813 Raises:
12814 QueryError: If the query to the network fails or is invalid.
12815 """
12816
12817 return self.query(
12818 "GlobalVoteThreshold",
12819 )
12820
12821 def get_max_allowed_subnets(self) -> int:
12822 """
12823 Queries the network for the maximum number of allowed subnets.
12824
12825 Retrieves the upper limit on the number of subnets that can
12826 be created or operated within the network.
12827
12828 Returns:
12829 The maximum number of allowed subnets on the network.
12830
12831 Raises:
12832 QueryError: If the query to the network fails or is invalid.
12833 """
12834
12835 return self.query(
12836 "MaxAllowedSubnets",
12837 params=[],
12838 )
12839
12840 def get_max_allowed_modules(self) -> int:
12841 """
12842 Queries the network for the maximum number of allowed modules.
12843
12844 Retrieves the upper limit on the number of modules that
12845 can be registered within the network.
12846
12847 Returns:
12848 The maximum number of allowed modules on the network.
12849
12850 Raises:
12851 QueryError: If the query to the network fails or is invalid.
12852 """
12853
12854 return self.query(
12855 "MaxAllowedModules",
12856 params=[],
12857 )
12858
12859 def get_min_stake(self, netuid: int = 0) -> int:
12860 """
12861 Queries the network for the minimum stake required to register a key.
12862
12863 Retrieves the minimum amount of stake necessary for
12864 registering a key within a specific network subnet.
12865
12866 Args:
12867 netuid: The network UID for which to query the minimum stake.
12868
12869 Returns:
12870 The minimum stake required for key registration in nanos.
12871
12872 Raises:
12873 QueryError: If the query to the network fails or is invalid.
12874 """
12875
12876 return self.query("MinStake", params=[netuid])
12877
12878 def get_stake(
12879 self,
12880 key: Ss58Address,
12881 netuid: int = 0,
12882 ) -> int:
12883 """
12884 Queries the network for the stake delegated with a specific key.
12885
12886        Retrieves the total amount of staked tokens
12887        delegated to a specific key address.
12888
12889 Args:
12890 key: The address of the key to query the stake for.
12891 netuid: The network UID from which to get the query.
12892
12893 Returns:
12894 The amount of stake held by the specified key in nanos.
12895
12896 Raises:
12897 QueryError: If the query to the network fails or is invalid.
12898 """
12899
12900 return self.query(
12901 "Stake",
12902 params=[netuid, key],
12903 )
12904
12905 def get_stakefrom(
12906 self,
12907 key_addr: Ss58Address,
12908 netuid: int = 0,
12909 ) -> dict[str, int]:
12910 """
12911        Retrieves the keys from which a specific key address receives stake.
12912
12913 Queries the network for all the stakes received by a
12914 particular key from different sources.
12915
12916 Args:
12917 key_addr: The address of the key to query stakes from.
12918
12919 netuid: The network UID from which to get the query.
12920
12921 Returns:
12922 A dictionary mapping key addresses to the amount of stake
12923 received from each.
12924
12925 Raises:
12926 QueryError: If the query to the network fails or is invalid.
12927 """
12928 result = self.query("StakeFrom", [netuid, key_addr])
12929
12930 return {k: v for k, v in result}
12931
12932 def get_staketo(
12933 self,
12934 key_addr: Ss58Address,
12935 netuid: int = 0,
12936 ) -> dict[str, int]:
12937 """
12938        Retrieves the keys to which a specific key address delegates stake.
12939
12940 Queries the network for all the stakes made by a particular key to
12941 different destinations.
12942
12943 Args:
12944 key_addr: The address of the key to query stakes to.
12945
12946 netuid: The network UID from which to get the query.
12947
12948 Returns:
12949 A dictionary mapping key addresses to the
12950 amount of stake given to each.
12951
12952 Raises:
12953 QueryError: If the query to the network fails or is invalid.
12954 """
12955
12956 result = self.query("StakeTo", [netuid, key_addr])
12957
12958 return {k: v for k, v in result}
12959
12960 def get_balance(
12961 self,
12962 addr: Ss58Address,
12963 ) -> int:
12964 """
12965 Retrieves the balance of a specific key.
12966
12967 Args:
12968 addr: The address of the key to query the balance for.
12969
12970 Returns:
12971 The balance of the specified key.
12972
12973 Raises:
12974 QueryError: If the query to the network fails or is invalid.
12975 """
12976
12977 result = self.query("Account", module="System", params=[addr])
12978
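        # Editor's note: `System.Account` returns the full AccountInfo
        # struct; only the free (spendable) balance is extracted here,
        # ignoring any reserved or frozen balance.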
12979 return result["data"]["free"]
12980
12981 def get_block(self, block_hash: str | None = None) -> dict[Any, Any] | None:
12982 """
12983 Retrieves information about a specific block in the network.
12984
12985 Queries the network for details about a block, such as its number,
12986 hash, and other relevant information.
12987
12988 Returns:
12989 The requested information about the block,
12990 or None if the block does not exist
12991 or the information is not available.
12992
12993 Raises:
12994 QueryError: If the query to the network fails or is invalid.
12995 """
12996
12997 with self.get_conn() as substrate:
12998 block: dict[Any, Any] | None = substrate.get_block( # type: ignore
12999 block_hash # type: ignore
13000 )
13001
13002 return block
13003
13004 def get_existential_deposit(self, block_hash: str | None = None) -> int:
13005 """
13006 Retrieves the existential deposit value for the network.
13007
13008 The existential deposit is the minimum balance that must be maintained
13009        in an account to prevent it from being purged. Denominated in nano units.
13010
13011 Returns:
13012 The existential deposit value in nano units.
13013        Note:
13014            The value is read from the chain's `Balances.ExistentialDeposit`
13015            runtime constant at the given block hash.
13016 """
13017
13018 with self.get_conn() as substrate:
13019 result: int = substrate.get_constant( # type: ignore
13020 "Balances", "ExistentialDeposit", block_hash
13021 ).value # type: ignore
13022
13023 return result
13024
13025
13026---
13027File: /validator-api/validator_api/communex/errors.py
13028---
13029
13030class ChainTransactionError(Exception):
13031 """Error for any chain transaction related errors."""
13032
13033
13034class NetworkError(BaseException):
13035 """Base for any network related errors."""
13036
13037
13038class NetworkQueryError(NetworkError):
13039 """Network query related error."""
13040
13041
13042class NetworkTimeoutError(NetworkError):
13043 """Timeout error"""
13044
13045
13046---
13047File: /validator-api/validator_api/communex/key.py
13048---
13049
13050from typing import TypeGuard
13051
13052from substrateinterface import Keypair # type: ignore
13053from substrateinterface.utils import ss58 # type: ignore
13054
13055from validator_api.communex.types import Ss58Address
13056
13057
13058def is_ss58_address(address: str, ss58_format: int = 42) -> TypeGuard[Ss58Address]:
13059 """
13060 Validates whether the given string is a valid SS58 address.
13061
13062 Args:
13063 address: The string to validate.
13064 ss58_format: The SS58 format code to validate against.
13065
13066 Returns:
13067 True if the address is valid, False otherwise.
13068 """
13069
13070 return ss58.is_valid_ss58_address(address, valid_ss58_format=ss58_format)
13071
13072
13073def check_ss58_address(address: str | Ss58Address, ss58_format: int = 42) -> Ss58Address:
13074 """
13075 Validates whether the given string is a valid SS58 address.
13076
13077 Args:
13078 address: The string to validate.
13079 ss58_format: The SS58 format code to validate against.
13080
13081 Returns:
13082 The validated SS58 address.
13083
13084 Raises:
13085 AssertionError: If the address is invalid.
13086 """
13087
13088 assert is_ss58_address(
13089 address, ss58_format), f"Invalid SS58 address '{address}'"
13090 return Ss58Address(address)
13091
13092
13093def generate_keypair() -> Keypair:
13094 """
13095 Generates a new keypair.
13096 """
13097 mnemonic = Keypair.generate_mnemonic()
13098 keypair = Keypair.create_from_mnemonic(mnemonic)
13099 return keypair
13100
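# Editor's sketch (not part of the audited file): a minimal round trip
# through the helpers above — generate a keypair, then validate and narrow
# its SS58 address.
if __name__ == "__main__":
    kp = generate_keypair()
    addr = check_ss58_address(kp.ss58_address)
    print(f"generated {addr}, valid: {is_ss58_address(addr)}")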
13101
13102---
13103File: /validator-api/validator_api/communex/types.py
13104---
13105
13106"""
13107Common types for the communex module.
13108"""
13109
13110from typing import NewType, TypedDict
13111
13112Ss58Address = NewType("Ss58Address", str)
13113"""Substrate SS58 address.
13114
13115The `SS58 encoded address format`_ is based on the Bitcoin Base58Check format,
13116but with a few modifications specifically designed to suit Substrate-based
13117chains.
13118
13119.. _SS58 encoded address format:
13120 https://docs.substrate.io/reference/address-formats/
13121"""
13122
13123
13124# TODO: replace with dataclasses
13125
13126
13127class NetworkParams(TypedDict):
13128 max_allowed_modules: int
13129 max_registrations_per_block: int
13130 target_registrations_interval: int # in blocks
13131 target_registrations_per_interval: int
13132 unit_emission: int
13133 max_name_length: int
13134 min_name_length: int
13135 burn_rate: int
13136 min_burn: int # min burn to register
13137 max_burn: int # max burn to register
13138 min_stake: int
13139 min_weight_stake: int
13140 max_allowed_subnets: int
13141 adjustment_alpha: int
13142 floor_delegation_fee: int
13143 max_allowed_weights: int
13144 curator: Ss58Address
13145 proposal_cost: int
13146 proposal_expiration: int
13147 proposal_participation_threshold: int
13148 subnet_stake_threshold: int
13149
13150
13151class SubnetParams(TypedDict):
13152 founder: Ss58Address
13153 founder_share: int
13154 immunity_period: int
13155 incentive_ratio: int
13156 max_allowed_uids: int
13157 max_allowed_weights: int
13158 min_allowed_weights: int
13159 max_stake: int
13160 max_weight_age: int
13161 min_stake: int
13162 name: str
13163 tempo: int
13164 trust_ratio: int
13165 vote_mode: str
13166 bonds_ma: int | None
13167 maximum_set_weight_calls_per_epoch: int | None
13168
13169
13170# redundant "TypedDict" inheritance because of pdoc warnings.
13171# see https://github.com/mitmproxy/pdoc/blob/26d40827ddbe1658e8ac46cd092f17a44cf0287b/pdoc/doc.py#L691-L692
13172class SubnetParamsWithEmission(SubnetParams, TypedDict):
13173 """SubnetParams with emission field.
13174 """
13175 emission: int
13176 """Subnet emission percentage (0-100).
13177 """
13178
13179
13180class ModuleInfo(TypedDict):
13181 uid: int
13182 key: Ss58Address
13183 name: str
13184 address: str # "<ip>:<port>"
13185 emission: int
13186 incentive: int
13187 dividends: int
13188 stake_from: list[tuple[str, int]] # TODO: type key with Ss58Address
13189 regblock: int # block number
13190 last_update: int # block number
13191 stake: int
13192 delegation_fee: int
13193 metadata: str
13194
13195
13196class ModuleInfoWithBalance(ModuleInfo):
13197 balance: int
13198
13199
13200class ModuleInfoWithOptionalBalance(ModuleInfo):
13201 balance: int | None
13202
13203
13204---
13205File: /validator-api/validator_api/cron/confirm_purchase.py
13206---
13207
13208import asyncio
13209import time
13210from datetime import datetime
13211import validator_api.config as config
13212
13213from sqlalchemy.orm import Session
13214
13215from validator_api.database import get_db_context
13216from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoStateInternal
13217
13218import bittensor as bt
13219
13220from validator_api.utils.wallet import get_transaction_from_block_hash
13221
13222def extrinsic_already_confirmed(db: Session, extrinsic_id: str) -> bool:
13223 record = db.query(FocusVideoRecord).filter(FocusVideoRecord.extrinsic_id == extrinsic_id)
13224 return record.first() is not None
13225
13226async def check_payment(db: Session, recipient_address: str, sender_address: str, amount: float, block_hash: str | None = None):
13227    sub = None  # bound up front so the finally-close below cannot hit an unbound name
13228    try:
13229        print(f"Checking payment of {amount} from {sender_address} to {recipient_address}")
13230        sub = bt.subtensor(network=config.NETWORK)
13231
13232 # Get all transfers associated with the recipient address
13233 transfers = await get_transaction_from_block_hash(sub, recipient_address, block_hash)
13234
13235 # Filter transfers to find the specific payment
13236 for transfer in transfers:
13237 if (
13238 transfer["from"] == sender_address and
13239 transfer["to"] == recipient_address and
13240 round(float(transfer["amount"]), 5) == round(amount, 5)
13241 ):
13242 if extrinsic_already_confirmed(db, transfer["extrinsicId"]):
13243 continue
13244 print(f"Payment of {amount} found from {sender_address} to {recipient_address}")
13245 return transfer["extrinsicId"]
13246
13247 print(f"Payment of {amount} not found from {sender_address} to {recipient_address}")
13248 return None
13249
13250 except Exception as e:
13251 print(f'Error in checking payment: {e}')
13252 return None
13253
13254    finally:
13255        if sub is not None: sub.close()
13256
13257SUBTENSOR_RETRIES = 5
13258SUBTENSOR_DELAY_SECS = 2
13259
13260async def confirm_transfer(
13261 db: Session,
13262 video_owner_coldkey: str,
13263 video_id: str,
13264 miner_hotkey: str,
13265    block_hash: str | None = None,
13266 with_lock: bool = False
13267):
13268 subtensor = bt.subtensor(network=config.NETWORK)
13269
13270 video = db.query(FocusVideoRecord).filter(
13271 FocusVideoRecord.video_id == video_id,
13272 FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASE_PENDING,
13273 FocusVideoRecord.miner_hotkey == miner_hotkey,
13274 FocusVideoRecord.deleted_at.is_(None),
13275 )
13276 if with_lock:
13277 video = video.with_for_update()
13278 video = video.first()
13279
13280 if not video:
13281 print(f"Video <{video_id}> not found")
13282 return False
13283
13284 amount = video.expected_reward_tao
13285
13286 current_time = datetime.utcnow()
13287    print(f"[{current_time}] | Scanning block hash <{block_hash}> for a payment transaction to address <{video_owner_coldkey}> ...")
13288 for attempt in range(SUBTENSOR_RETRIES):
13289 try:
13290 miner_coldkey = subtensor.get_hotkey_owner(miner_hotkey)
13291 print(f"Miner coldkey: {miner_coldkey}")
13292
13293 extrinsic_id = await check_payment(db, video_owner_coldkey, miner_coldkey, amount, block_hash)
13294 if extrinsic_id is not None:
13295 print(f"Miner <{miner_hotkey}> successfully purchased focus recording <{video_id}>!")
13296 video.miner_hotkey = miner_hotkey
13297 video.processing_state = FocusVideoStateInternal.PURCHASED
13298 video.updated_at = datetime.utcnow()
13299 video.extrinsic_id = extrinsic_id
13300 video.earned_reward_tao = amount
13301 db.add(video)
13302 db.commit()
13303 return True
13304
13305 except Exception as e:
13306 if attempt < SUBTENSOR_RETRIES - 1: # if it's not the last attempt
13307 if "Broken pipe" in str(e) or "EOF occurred in violation of protocol" in str(e) or "[SSL: BAD_LENGTH]" in str(e):
13308 print(f"Connection to subtensor was lost. Re-initializing subtensor and retrying in {SUBTENSOR_DELAY_SECS} seconds...")
13309 subtensor = bt.subtensor(network=config.NETWORK)
13310 await asyncio.sleep(SUBTENSOR_DELAY_SECS)
13311 else:
13312 print(f"Attempt #{attempt + 1} to sub.get_hotkey_owner() and check_payment() failed. Retrying in {SUBTENSOR_DELAY_SECS} seconds...")
13313 print(f"Error: {str(e)}")
13314 await asyncio.sleep(SUBTENSOR_DELAY_SECS)
13315 else:
13316 print(f"All {SUBTENSOR_RETRIES} attempts failed. Unable to retrieve miner coldkey and confirm payment.")
13317 print(f"Final error: {str(e)}")
13318 return False
13319 # we got here because we could not confirm the payment. Let's return false to let the miner know
13320 return False
13321
13322
13323DELAY_SECS = 30 # 30s
13324RETRIES = 6 # 30s x 6 retries = 180s = 3 mins
13325
13326async def confirm_video_purchased(
13327 video_id: str,
13328 with_lock: bool = False
13329):
13330 """
13331 The purpose of this function is to set the video back to the SUBMITTED state
13332 if the miner has not confirmed the purchase in time.
13333 """
13334
13335 current_time = datetime.utcnow()
13336 print(f"BACKGROUND TASK | {current_time} | Checking if video_id <{video_id}> has been marked as purchased or reverted back to SUBMITTED ...")
13337 try:
13338 for i in range(0, RETRIES):
13339 await asyncio.sleep(DELAY_SECS)
13340 try:
13341 with get_db_context() as db:
13342 video = db.query(FocusVideoRecord).filter(
13343 FocusVideoRecord.video_id == video_id,
13344 FocusVideoRecord.deleted_at.is_(None),
13345 )
13346 if with_lock:
13347 video = video.with_for_update()
13348 video = video.first()
13349
13350 if not video:
13351 print(f"Video <{video_id}> not found")
13352 return False
13353
13354 if video is not None and video.processing_state == FocusVideoStateInternal.PURCHASED:
13355 print(f"Video <{video_id}> has been marked as PURCHASED. Stopping background task.")
13356 return True
13357 elif video is not None and video.processing_state == FocusVideoStateInternal.SUBMITTED:
13358 print(f"Video <{video_id}> has been marked as SUBMITTED. Stopping background task.")
13359 return True
13360
13361 print(f"Video <{video_id}> has NOT been marked as PURCHASED. Retrying in {DELAY_SECS} seconds...")
13362 # close the db connection until next retry
13363 db.close()
13364
13365 except Exception as e:
13366 print(f"Error in checking confirm_video_purchased loop: {e}")
13367
13368 # we got here because we could not confirm the payment in time, so we need to revert
13369 # the video back to the SUBMITTED state (i.e. mark available for purchase)
13370 print(f"Video <{video_id}> has NOT been marked as PURCHASED. Reverting to SUBMITTED state...")
13371        # re-open a fresh session: the one from the retry loop above is already closed
13372        with get_db_context() as db:
13373            video = db.query(FocusVideoRecord).filter(FocusVideoRecord.video_id == video_id, FocusVideoRecord.deleted_at.is_(None)).first()
13374            if video is not None:
13375                video.processing_state = FocusVideoStateInternal.SUBMITTED
                     video.updated_at = datetime.utcnow()
                     db.add(video)
                     db.commit()
13376 return False
13377
13378 except Exception as e:
13379 print(f"Error in confirm_video_purchased: {e}")
13380
13381 return False
13382
13383
13384
13385---
13386File: /validator-api/validator_api/database/crud/focusvideo.py
13387---
13388
13389from datetime import datetime, timedelta
13390from fastapi import HTTPException
13391from sqlalchemy.orm import Session, joinedload
13392from sqlalchemy import func, Float
13393from typing import List, Optional, Dict
13394import json
13395import time
13396import asyncio
13397
13398from validator_api.database import get_db_context
13399from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoInternal, FocusVideoStateInternal, TaskType
13400from validator_api.database.models.user import UserRecord
13401from validator_api.utils.marketplace import get_max_focus_tao, get_purchase_max_focus_tao, get_max_focus_points_available_today
13402from pydantic import BaseModel
13403from validator_api.services.scoring_service import VideoScore, FocusVideoEmbeddings
13404
13405
13406MIN_REWARD_TAO = 0.001
13407
13408
13409class CachedValue:
13410 def __init__(self, duration: int = 90):
13411 self._value = None
13412 self._timestamp = 0
13413 self._duration = duration
13414 self._mutex = asyncio.Lock()
13415
13416 def is_valid(self) -> bool:
13417 return (
13418 self._value is not None and
13419 time.time() - self._timestamp < self._duration
13420 )
13421
13422 async def get_or_update(self, fetch_func):
13423 if self.is_valid():
13424 return self._value
13425
13426 try:
13427 async with self._mutex:
13428 # Double check after acquiring lock
13429 if not self.is_valid():
13430 self._value = await fetch_func()
13431 self._timestamp = time.time()
13432 return self._value
13433
13434 except Exception as e:
13435 print(e)
13436 raise HTTPException(500, detail="Internal error")
13437
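# Editor's note (not part of the audited source): CachedValue implements
# double-checked caching — the unlocked is_valid() fast path avoids lock
# contention, while the re-check under the mutex keeps concurrent cache
# misses from fetching more than once. A hypothetical standalone use:
#
#     price_cache = CachedValue(duration=30)
#
#     async def get_price():
#         return await price_cache.get_or_update(fetch_price_from_api)
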
13438async def _fetch_available_focus(db: Session):
13439 # Show oldest videos first so they get rewarded fastest
13440 items = db.query(FocusVideoRecord).filter(
13441 FocusVideoRecord.processing_state == FocusVideoStateInternal.SUBMITTED,
13442 FocusVideoRecord.deleted_at.is_(None),
13443 FocusVideoRecord.expected_reward_tao > MIN_REWARD_TAO,
13444 ).order_by(FocusVideoRecord.updated_at.asc()).limit(10).all()
13445 return [FocusVideoInternal.model_validate(record) for record in items]
13446
13447_available_focus_cache = CachedValue()
13448
13449async def get_all_available_focus(db: Session):
13450 return await _available_focus_cache.get_or_update(
13451 lambda: _fetch_available_focus(db)
13452 )
13453
13454def get_pending_focus(
13455 db: Session,
13456 miner_hotkey: str
13457):
13458 try:
13459 items = db.query(FocusVideoRecord).filter_by(
13460 processing_state=FocusVideoStateInternal.PURCHASE_PENDING,
13461 miner_hotkey=miner_hotkey
13462 ).all()
13463 return items
13464
13465 except Exception as e:
13466 print(e)
13467 raise HTTPException(500, detail="Internal error")
13468
13469async def check_availability(
13470 db: Session,
13471 video_id: str,
13472 miner_hotkey: str,
13473 with_lock: bool = False
13474):
13475 try:
13476 video_record = db.query(FocusVideoRecord).filter(
13477 FocusVideoRecord.video_id == video_id,
13478 FocusVideoRecord.deleted_at.is_(None),
13479 FocusVideoRecord.processing_state == FocusVideoStateInternal.SUBMITTED, # is available for purchase
13480 FocusVideoRecord.expected_reward_tao > MIN_REWARD_TAO,
13481 )
13482 if with_lock:
13483 video_record = video_record.with_for_update()
13484 video_record = video_record.first()
13485
13486 if video_record is None:
13487 return {
13488 'status': 'error',
13489 'message': f'video {video_id} not found or not available for purchase'
13490 }
13491
13492 if video_record.expected_reward_tao is None:
13493 raise HTTPException(500, detail="The video record is missing the expected reward tao, investigate this bug")
13494
13495 # mark the purchase as pending i.e. a miner has claimed the video for purchase and now just needs to pay
13496 video_record.processing_state = FocusVideoStateInternal.PURCHASE_PENDING
13497 video_record.miner_hotkey = miner_hotkey
13498 video_record.updated_at = datetime.utcnow()
13499
13500 # NOTE: we don't set the video_record.earned_reward_tao here, because we don't know if the
13501 # miner will successfully purchase the video or not. We set it later in cron/confirm_purchase.py
13502
13503 db.add(video_record)
13504 db.commit()
13505
13506 return {
13507 'status': 'success',
13508 'price': video_record.expected_reward_tao
13509 }
13510
13511 except Exception as e:
13512 print(e)
13513 raise HTTPException(500, detail="Internal error")
13514
13515def get_purchased_list(
13516 db: Session,
13517 miner_hotkey: str
13518):
13519 try:
13520 purchased_list = db.query(FocusVideoRecord).filter_by(
13521 processing_state=FocusVideoStateInternal.PURCHASED,
13522 miner_hotkey=miner_hotkey
13523 ).all()
13524
13525 # result = [
13526 # {
13527 # "id": video.id,
13528 # "task_id": video.task_id,
13529 # "link": video.link,
13530 # "score": video.score,
13531 # "creator": video.creator,
13532 # "miner_uid": video.miner_uid,
13533 # "miner_hotkey": video.miner_hotkey,
13534 # "estimated_tao": video.estimated_tao,
13535 # "reward_tao": video.reward_tao,
13536 # "status": video.status,
13537 # "created_at": video.created_at,
13538 # "task_str": video.task.focusing_task if video.task else None
13539 # }
13540 # for video in purchased_list
13541 # ]
13542
13543 # FV TODO: again, what is this for????
13544 # for video in purchased_list:
13545 # task = get_task(db, video.task_id)
13546 # video.task_str = task.focusing_task
13547
13548 return purchased_list
13549 except Exception as e:
13550 print(e)
13551 # raise HTTPException(500, detail="Internal error")
13552 return []
13553
13554# def get_consumed_list(
13555# db: Session,
13556# miner_hotkey: str
13557# ):
13558# try:
13559# list = db.query(FocusVideoRecord).filter_by(
13560# processing_state=FocusVideoStateInternal.CONSUMED,
13561# miner_hotkey=miner_hotkey
13562# ).all()
13563
13564# return list
13565# except Exception as e:
13566# print(e)
13567# # raise HTTPException(500, detail="Internal error")
13568# return []
13569
13570async def check_video_metadata(
13571 db: Session,
13572 video_id: str,
13573 user_email: str,
13574 miner_hotkey: str
13575):
13576 try:
13577 video_info = db.query(FocusVideoRecord).filter(
13578 FocusVideoRecord.video_id == video_id,
13579 FocusVideoRecord.user_email == user_email,
13580 FocusVideoRecord.miner_hotkey == miner_hotkey,
13581 FocusVideoRecord.deleted_at.is_(None)
13582 ).first()
13583
13584 if video_info is not None and video_info.processing_state == FocusVideoStateInternal.PURCHASED:
13585
13586 # # FV TODO: why do we need the task info?
13587 # task_info = db.query(models.Task).filter_by(id=video_info.task_id).first()
13588
13589 # if task_info is not None:
13590 # video_info.status = FocusVideoEnum.Submitted
13591 # db.add(video_info)
13592 # db.commit()
13593 # video_score = await score.score_video(task_info.focusing_task, task_info.clip_link)
13594 # print(f"Video score: {video_score}")
13595 # return {
13596 # 'success': True,
13597 # 'score': video_score
13598 # }
13599
13600 # return {
13601 # 'success': False,
13602 # 'message': 'No task found.'
13603 # }
13604
13605 # video_info.processing_state = FocusVideoStateInternal.VALIDATING
13606 db.add(video_info)
13607 db.commit()
13608
13609 # video_score = await score.score_video(task_info.focusing_task, task_info.clip_link)
13610 # print(f"Video score: {video_score}")
13611 video_score = video_info.video_score
13612
13613 return {
13614 'success': True,
13615 'score': video_score
13616 }
13617
13618 return {
13619 'success': False,
13620 'message': 'No video found.'
13621 }
13622
13623 except Exception as e:
13624 print(e)
13625 return {
13626 'success': False,
13627            'message': 'Internal Server Error'
13628 }
13629
13630# async def consume_video(db: Session, video_ids: str):
13631# print(f"Consuming focus video: <{video_ids}>")
13632# try:
13633# videos = db.query(FocusVideoRecord).filter(
13634# FocusVideoRecord.video_id.in_(video_ids)
13635# ).all()
13636# if len(videos) > 0:
13637# for video in videos:
13638# if video.processing_state == FocusVideoStateInternal.CONSUMED:
13639# return {
13640# 'success': False,
13641# 'message': 'Already consumed.'
13642# }
13643# video.processing_state = FocusVideoStateInternal.CONSUMED
13644# db.add(video)
13645# db.commit()
13646# return {
13647# 'success': True
13648# }
13649# else:
13650# return {
13651# 'success': False,
13652# 'message': 'No Video Found'
13653# }
13654# except Exception as e:
13655# print(e)
13656# return {
13657# 'success': False,
13658# 'message': 'Internal Server Error'
13659# }
13660
13661# def add_task_str(db:Session, video: any):
13662# task = get_task(db, video.task_id)
13663# video.task_str = task.focusing_task
13664# return video
13665
13666def get_video_owner_coldkey(db: Session, video_id: str) -> str:
13667 video_record = db.query(FocusVideoRecord).filter(
13668 FocusVideoRecord.video_id == video_id,
13669 FocusVideoRecord.deleted_at.is_(None)
13670 )
13671 video_record = video_record.first()
13672
13673 if video_record is None:
13674 raise HTTPException(404, detail="Focus video not found")
13675
13676 user_record = db.query(UserRecord).filter(UserRecord.email == video_record.user_email,).first()
13677 if user_record is None:
13678 raise HTTPException(404, detail="User not found")
13679
13680 return user_record.coldkey
13681
13682_already_purchased_cache = CachedValue()
13683
13684async def _already_purchased_max_focus_tao() -> bool:
13685 with get_db_context() as db:
13686 effective_max_focus_tao = await get_purchase_max_focus_tao()
13687 total_earned_tao = db.query(func.sum(FocusVideoRecord.earned_reward_tao)).filter(
13688 FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASED,
13689 FocusVideoRecord.updated_at >= datetime.utcnow() - timedelta(hours=24)
13690 ).scalar() or 0
13691 return total_earned_tao >= effective_max_focus_tao
13692
13693async def already_purchased_max_focus_tao() -> bool:
13694 return await _already_purchased_cache.get_or_update(
13695 lambda: _already_purchased_max_focus_tao()
13696 )
13697
13698class MinerPurchaseStats(BaseModel):
13699 purchased_videos: List[FocusVideoInternal]
13700 total_focus_points: float
13701 max_focus_points: float
13702 focus_points_percentage: float
13703
13704async def get_miner_purchase_stats(db: Session, miner_hotkey: str) -> MinerPurchaseStats:
13705 # Get videos purchased by miner in the last 24 hours
13706 purchased_videos_records = db.query(FocusVideoRecord).filter(
13707 FocusVideoRecord.miner_hotkey == miner_hotkey,
13708 FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASED,
13709 FocusVideoRecord.updated_at >= datetime.utcnow() - timedelta(hours=24)
13710 )
13711 purchased_videos_records = purchased_videos_records.all()
13712
13713 purchased_videos = [
13714 FocusVideoInternal.model_validate(video_record)
13715 for video_record in purchased_videos_records
13716 ]
13717
13718 # Calculate total score for purchased videos (focus points = score * 100)
13719 total_focus_points = sum(video.video_score * 100 for video in purchased_videos)
13720
13721 # Calculate percentage
13722 max_focus_tao = await get_max_focus_tao()
13723 max_focus_points = get_max_focus_points_available_today(max_focus_tao)
13724 focus_points_percentage = total_focus_points / max_focus_points if max_focus_points > 0 else 0
13725
13726 return MinerPurchaseStats(
13727 purchased_videos=purchased_videos,
13728 total_focus_points=total_focus_points,
13729 max_focus_points=max_focus_points,
13730 focus_points_percentage=focus_points_percentage
13731 )
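
# Editor's example of the arithmetic above: a miner who purchased two videos
# scored 0.8 and 0.5 holds 0.8 * 100 + 0.5 * 100 = 130 focus points; if
# max_focus_points is 1000, focus_points_percentage = 130 / 1000 = 0.13.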
13732
13733def set_focus_video_score(db: Session, video_id: str, score_details: VideoScore, embeddings: FocusVideoEmbeddings):
13734 video_record = db.query(FocusVideoRecord).filter(
13735 FocusVideoRecord.video_id == video_id,
13736 FocusVideoRecord.deleted_at.is_(None)
13737 ).first()
13738 if video_record is None:
13739 raise HTTPException(404, detail="Focus video not found")
13740
13741 video_record.video_score = score_details.final_score
13742 video_record.video_details = {
13743 **video_record.video_details,
13744 **json.loads(score_details.model_dump_json()),
13745 }
13746 video_record.embeddings = json.loads(embeddings.model_dump_json())
13747 video_record.processing_state = FocusVideoStateInternal.READY
13748 video_record.updated_at = datetime.utcnow()
13749 video_record.task_type = TaskType.BOOSTED if score_details.boosted_multiplier > 1.0 else TaskType.USER
13750 db.add(video_record)
13751 db.commit()
13752
13753def mark_video_rejected(
13754 db: Session,
13755 video_id: str,
13756 rejection_reason: str,
13757 score_details: Optional[VideoScore]=None,
13758 embeddings: Optional[FocusVideoEmbeddings]=None,
13759 exception_string: Optional[str]=None,
13760):
13761 video_record = db.query(FocusVideoRecord).filter(
13762 FocusVideoRecord.video_id == video_id,
13763 FocusVideoRecord.deleted_at.is_(None)
13764 ).first()
13765 if video_record is None:
13766 raise HTTPException(404, detail="Focus video not found")
13767
13768 video_details = { **video_record.video_details }
13769
13770 if score_details:
13771 video_details = {
13772 **video_details,
13773 **json.loads(score_details.model_dump_json()),
13774 }
13775
13776 if exception_string:
13777 video_details["exception"] = exception_string
13778
13779 if score_details or exception_string:
13780 video_record.video_details = video_details
13781
13782 if embeddings:
13783 video_record.embeddings = json.loads(embeddings.model_dump_json())
13784
13785 video_record.processing_state = FocusVideoStateInternal.REJECTED
13786 video_record.rejection_reason = rejection_reason
13787 db.add(video_record)
13788 db.commit()
13789
13790def mark_video_submitted(db: Session, video_id: str, with_lock: bool = False):
13791 # Mark video as "SUBMITTED" if in the "PURCHASE_PENDING" state.
13792 video_record = db.query(FocusVideoRecord).filter(
13793 FocusVideoRecord.video_id == video_id,
13794 FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASE_PENDING,
13795 FocusVideoRecord.deleted_at.is_(None)
13796 )
13797 if with_lock:
13798 video_record = video_record.with_for_update()
13799 video_record = video_record.first()
13800
13801 if video_record is None:
13802 raise HTTPException(404, detail="Focus video not found or not in the correct state: PURCHASE_PENDING")
13803
13804 video_record.processing_state = FocusVideoStateInternal.SUBMITTED
13805 video_record.updated_at = datetime.utcnow()
13806 db.add(video_record)
13807 db.commit()
13808
13809_focus_points_cache = CachedValue(duration=60) # Cache for 60 seconds
13810
13811async def _fetch_focus_points(db: Session) -> Dict[TaskType, float]:
13812 results = db.query(
13813 FocusVideoRecord.task_type,
13814 func.sum(
13815 func.cast(FocusVideoRecord.video_details['duration'].astext, Float) *
13816 FocusVideoRecord.video_score
13817 ).label('focus_points')
13818 ).filter(
13819 FocusVideoRecord.processing_state.in_([
13820 FocusVideoStateInternal.SUBMITTED,
13821 FocusVideoStateInternal.PURCHASED
13822 ]),
13823 FocusVideoRecord.created_at >= datetime.utcnow() - timedelta(hours=24)
13824 ).group_by(FocusVideoRecord.task_type).all()
13825
13826 # Initialize dict with all TaskType values set to 0
13827 focus_points = {task_type: 0 for task_type in TaskType}
13828
13829 # Update with actual results
13830 for task_type, points in results:
13831 focus_points[task_type] = points or 0
13832
13833 return focus_points
13834
13835async def get_focus_points_from_last_24_hours(db: Session) -> Dict[TaskType, float]:
13836 return await _focus_points_cache.get_or_update(
13837 lambda: _fetch_focus_points(db)
13838 )
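
# Editor's example of the aggregation in _fetch_focus_points above: a
# SUBMITTED video with video_details["duration"] = 600 (seconds) and
# video_score = 0.2 contributes 600 * 0.2 = 120 focus points to its task
# type's bucket.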
13839
13840
13841---
13842File: /validator-api/validator_api/database/models/__init__.py
13843---
13844
13845
13846
13847
13848---
13849File: /validator-api/validator_api/database/models/boosted_task.py
13850---
13851
13852from sqlalchemy import Column, String, Float, Integer, DateTime, Boolean
13853from validator_api.database import Base
13854from datetime import datetime
13855
13856class BoostedTask(Base):
13857 __tablename__ = 'boosted_tasks'
13858
13859 id = Column(Integer, primary_key=True)
13860 created_at = Column(DateTime, nullable=False, default=datetime.utcnow)
13861 title = Column(String(1000), nullable=False)
13862 description = Column(String(1000), nullable=False)
13863 multiplier = Column(Float, nullable=False)
13864 active = Column(Boolean, nullable=False, default=True)
13865
13866
13867
13868---
13869File: /validator-api/validator_api/database/models/focus_video_record.py
13870---
13871
13872from datetime import datetime
13873import uuid
13874from typing import Optional
13875
13876from pydantic import BaseModel, ConfigDict
13877from sqlalchemy import Column, String, DateTime, Float, Enum, Integer
13878
13879from validator_api.database import Base
13880from sqlalchemy.dialects.postgresql import JSONB
13881from validator_api.config import DB_STRING_LENGTH
13882
13883import enum
13884
13885class TaskType(enum.Enum):
13886 USER = "USER"
13887 BOOSTED = "BOOSTED"
13888
13889class FocusVideoStateExternal(enum.Enum):
13890 PROCESSING = "PROCESSING"
13891 READY = "READY"
13892 REJECTED = "REJECTED"
13893 SUBMITTED = "SUBMITTED"
13894 REWARDED = "REWARDED"
13895
13896class FocusVideoStateInternal(enum.Enum):
13897 # OMEGA Focus user facing states
13898 PROCESSING = "PROCESSING" # User has completed task, we are currently calculating their score and checking if the video is legit
13899 READY = "READY" # Score has been calculated and task is eligible for submission
13900 REJECTED = "REJECTED" # Turns out that the task was NOT eligible for submission, lifecycle ended here
13901 SUBMITTED = "SUBMITTED" # User has pressed "Submit" and the task is now listed on the marketplace, for SN24 miners to buy
13902
13903 # Miner purchase states
13904    PURCHASE_PENDING = "PURCHASE_PENDING"  # a miner has requested to buy the video, and we have sent them the amount of tao that they need to send to the focus user
13905 PURCHASED = "PURCHASED" # our background cron has confirmed that the miner has bought the focus video
13906
13907 # I think that these 2 states don't even need to exist?
13908 # VALIDATING = "VALIDATING"
13909 # CONSUMED = "CONSUMED"
13910
13911def map_focus_video_state(state: FocusVideoStateInternal) -> FocusVideoStateExternal:
13912 """
13913    The first 4 states are the ones that the user sees. The last 2 states are the ones that the
13914 miner sees. All the user needs to know is whether the video has been purchased by a miner.
13915 """
13916 state_mapping = {
13917 FocusVideoStateInternal.PROCESSING: FocusVideoStateExternal.PROCESSING,
13918 FocusVideoStateInternal.READY: FocusVideoStateExternal.READY,
13919 FocusVideoStateInternal.REJECTED: FocusVideoStateExternal.REJECTED,
13920 FocusVideoStateInternal.SUBMITTED: FocusVideoStateExternal.SUBMITTED,
13921 FocusVideoStateInternal.PURCHASE_PENDING: FocusVideoStateExternal.SUBMITTED,
13922 FocusVideoStateInternal.PURCHASED: FocusVideoStateExternal.REWARDED,
13923 # FocusVideoStateInternal.VALIDATING: FocusVideoStateExternal.REWARDED,
13924 # FocusVideoStateInternal.CONSUMED: FocusVideoStateExternal.REWARDED,
13925 }
13926 if state in state_mapping:
13927 return state_mapping[state]
13928 else:
13929 raise ValueError(f"Invalid focus video state: {state}")
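
# Editor's sketch: from the marketplace's point of view a pending purchase
# is still listed, so the mapping collapses it back to SUBMITTED:
#
#     assert map_focus_video_state(
#         FocusVideoStateInternal.PURCHASE_PENDING
#     ) == FocusVideoStateExternal.SUBMITTED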
13930
13931class FocusVideoRecord(Base):
13932 __tablename__ = 'focus_videos'
13933
13934 video_id = Column(String(DB_STRING_LENGTH), primary_key=True, default=lambda: str(uuid.uuid4()), nullable=False)
13935 task_id = Column(String(DB_STRING_LENGTH), nullable=False)
13936 user_id = Column(String, nullable=False)
13937 user_email = Column(String, nullable=False)
13938 processing_state = Column(Enum(FocusVideoStateInternal), nullable=False, default=FocusVideoStateInternal.PROCESSING)
13939 task_type = Column(Enum(TaskType), nullable=False, default=TaskType.USER)
13940 video_score = Column(Float, nullable=True)
13941 video_details = Column(JSONB, nullable=True)
13942 embeddings = Column(JSONB, nullable=True)
13943 rejection_reason = Column(String(1000), nullable=True)
13944 expected_reward_tao = Column(Float, nullable=True)
13945 earned_reward_tao = Column(Float, nullable=True)
13946 miner_hotkey = Column(String(DB_STRING_LENGTH), nullable=True)
13947 extrinsic_id = Column(String(DB_STRING_LENGTH), nullable=True)
13948 created_at = Column(DateTime, default=datetime.utcnow)
13949 updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
13950 deleted_at = Column(DateTime, nullable=True)
13951
13952 def get_duration(self) -> float:
13953 return float(self.video_details.get("duration", 0.0))
13954
13955class FocusVideoBase(BaseModel):
13956 video_id: str
13957 task_id: str
13958 user_email: str
13959 task_type: TaskType
13960 video_score: Optional[float]
13961 rejection_reason: Optional[str]
13962 expected_reward_tao: Optional[float]
13963 earned_reward_tao: Optional[float]
13964 created_at: datetime
13965 updated_at: datetime
13966 deleted_at: Optional[datetime]
13967
13968class FocusVideoInternal(FocusVideoBase):
13969 model_config = ConfigDict(from_attributes=True)
13970
13971 processing_state: FocusVideoStateInternal
13972 miner_hotkey: Optional[str]
13973
13974
13975
13976---
13977File: /validator-api/validator_api/database/models/task.py
13978---
13979
13980from sqlalchemy import Column, String, Boolean, Float, DateTime, Integer
13981from validator_api.config import DB_STRING_LENGTH
13982from validator_api.database import Base
13983from datetime import datetime
13984from pydantic import BaseModel, ConfigDict
13985from typing import Optional
13986
13987class TaskRecordPG(Base):
13988 __tablename__ = 'tasks'
13989 id = Column(String(DB_STRING_LENGTH), primary_key=True, nullable=False)
13990 info = Column(String(DB_STRING_LENGTH))
13991 description = Column(String(DB_STRING_LENGTH))
13992 checked = Column(Boolean, default=False)
13993 date = Column(DateTime, default=datetime.utcnow)
13994 theme = Column(String(DB_STRING_LENGTH), nullable=True)
13995 score = Column(Float)
13996 user_id = Column(String(DB_STRING_LENGTH))
13997 chat_id = Column(String(DB_STRING_LENGTH), nullable=True)
13998 reason = Column(String(DB_STRING_LENGTH), nullable=True)
13999 boosted_id = Column(Integer, nullable=True)
14000
14001
14002class Task(BaseModel):
14003 model_config = ConfigDict(from_attributes=True)
14004
14005 id: str
14006 info: str
14007 description: str
14008 checked: bool
14009 date: datetime
14010 theme: Optional[str]
14011 score: float
14012 user_id: str
14013 chat_id: Optional[str]
14014 reason: Optional[str]
14015 boosted_id: Optional[int]
14016
14017
14018
14019---
14020File: /validator-api/validator_api/database/models/user.py
14021---
14022
14023from datetime import datetime
14024
14025from sqlalchemy import Column, String, Float, DateTime
14026from pydantic import BaseModel
14027
14028from validator_api.config import DB_STRING_LENGTH, DB_STRING_LENGTH_LONG
14029from validator_api.database import Base
14030
14031
14032class UserRecord(Base):
14033 __tablename__ = 'users'
14034
14035 id = Column(String, primary_key=True, nullable=False)
14036 email = Column(String(DB_STRING_LENGTH), primary_key=True, nullable=False)
14037 name = Column(String(DB_STRING_LENGTH))
14038 coldkey = Column(String(DB_STRING_LENGTH))
14039 hotkey = Column(String(DB_STRING_LENGTH))
14040 tao_balance = Column(Float)
14041 tao_check_time = Column(DateTime, nullable=True)
14042 focused_task_id = Column(String(DB_STRING_LENGTH), nullable=True)
14043 created_at = Column(DateTime, default=datetime.utcnow)
14044
14045
14046class User(BaseModel):
14047 id: str
14048 email: str
14049 name: str
14050 tao_balance: float
14051 tao_check_time: datetime
14052 focused_task_id: str
14053 created_at: datetime
14054
14055
14056class UserInternal(BaseModel):
14057 coldkey: str
14058 hotkey: str
14059
14060
14061
14062---
14063File: /validator-api/validator_api/database/__init__.py
14064---
14065
14066from validator_api import config
14067from sqlalchemy import create_engine
14068from sqlalchemy.schema import MetaData
14069from sqlalchemy.ext.declarative import declarative_base
14070from sqlalchemy.orm import sessionmaker
14071from contextlib import contextmanager
14072
14073DB_HOST = config.FOCUS_DB_HOST
14074DB_NAME = config.FOCUS_DB_NAME
14075DB_USER = config.FOCUS_DB_USER
14076DB_PASSWORD = config.FOCUS_DB_PASSWORD
14077DB_PORT = config.FOCUS_DB_PORT
14078
14079DATABASE_URL = f"postgresql+psycopg2://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
14080
14081engine = create_engine(
14082 DATABASE_URL,
14083 pool_size=20, # bumped up from default of 5
14084 max_overflow=25, # bumped up from default of 10
14085 pool_timeout=15, # bumped down from default of 30
14086 pool_pre_ping=True, # Good practice for most scenarios
14087 pool_recycle=3600, # Recycle connections after 1 hour
14088)
14089SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
14090Base = declarative_base()
14091metadata = MetaData()
14092
14093def get_db():
14094 db = SessionLocal()
14095 try:
14096 yield db
14097 finally:
14098 db.close()
14099
14100def get_db_context():
14101 return contextmanager(get_db)()
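
# Editor's sketch: get_db() is a FastAPI-style dependency generator;
# get_db_context() adapts it into a context manager for use outside the
# request cycle, e.g.:
#
#     with get_db_context() as db:
#         rows = db.query(UserRecord).all()   # UserRecord is illustrative
#     # the generator's finally block closes the session on exit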
14102
14103
14104
14105---
14106File: /validator-api/validator_api/database/encrypted_json.py
14107---
14108
14109from cryptography.fernet import Fernet, InvalidToken
14110import json
14111from typing import Optional, Union
14112
14113from sqlalchemy.types import TypeDecorator, LargeBinary
14114from sqlalchemy.engine.interfaces import Dialect
14115from pydantic import BaseModel
14116
14117from validator_api.config import ENCRYPTION_KEY
14118
14119
14120fernet = Fernet(ENCRYPTION_KEY)
14121
14122# Type alias for any valid JSON type, including Pydantic BaseModel
14123JSONType = Union[dict, list, str, int, float, bool, None, BaseModel]
14124
14125
14126class EncryptedJSON(TypeDecorator):
14127 # For MySQL, the default limit here is 64 kb. In the prod DB, I (Salman) set it to 4GB.
14128 impl = LargeBinary
14129
14130 def process_bind_param(self, value: Optional[JSONType], dialect: Dialect) -> Optional[bytes]:
14131 if value is not None:
14132 try:
14133 return encrypt_data(value)
14134 except (TypeError, ValueError) as e:
14135 raise ValueError(f"Error encrypting data: {str(e)}")
14136 return None
14137
14138 def process_result_value(self, value: Optional[bytes], dialect: Dialect) -> Optional[JSONType]:
14139 if value is not None:
14140 try:
14141 return decrypt_data(value)
14142 except (InvalidToken, json.JSONDecodeError) as e:
14143 raise ValueError(f"Error decrypting data: {str(e)}")
14144 return None
14145
14146
14147def encrypt_data(data: JSONType) -> bytes:
14148 try:
14149 if isinstance(data, BaseModel):
14150 data = json.loads(data.model_dump_json())
14151 return fernet.encrypt(json.dumps(data).encode())
14152 except (TypeError, ValueError) as e:
14153 raise ValueError(f"Error encoding or encrypting data: {str(e)}")
14154
14155
14156def decrypt_data(encrypted_data: bytes) -> JSONType:
14157 try:
14158 decrypted_data = fernet.decrypt(encrypted_data)
14159 return json.loads(decrypted_data.decode())
14160 except InvalidToken:
14161 raise ValueError("Invalid token or key used for decryption")
14162 except json.JSONDecodeError:
14163 raise ValueError("Decrypted data is not valid JSON")
14164
14165
14166class LargeEncryptedJSON(EncryptedJSON):
14167    impl = LargeBinary(length=4 * 1024 * 1024 * 1024 - 1)  # 4 GB - 1 byte because that's the MySQL LONGBLOB max
14168
14169class MediumEncryptedJSON(EncryptedJSON):
14170 impl = LargeBinary(length=16 * 1024 * 1024 - 1) # 16 MB - 1 byte (MySQL MEDIUMBLOB max size)
14171
14172def test_encrypted_json():
14173 encrypted_json_type = EncryptedJSON()
14174
14175 class FakeModel(BaseModel):
14176 name: str
14177 value: int
14178
14179 class NestedFakeModel(BaseModel):
14180 nested: FakeModel
14181
14182 # Test with different JSON types
14183 test_cases = [
14184 {"key": "value"}, # dict
14185 ["item1", "item2"], # list
14186 "string", # str
14187 42, # int
14188 3.14, # float
14189 True, # bool
14190 None, # null
14191 {"nested": {"list": [1, 2, 3], "dict": {"a": 1, "b": 2}}}, # complex nested structure
14192 FakeModel(name="Test", value=123), # Pydantic BaseModel
14193 NestedFakeModel(nested=FakeModel(name="Nested", value=456)), # Nested Pydantic BaseModel
14194 ]
14195
14196 for case in test_cases:
14197 # Simulate database write
14198 encrypted = encrypted_json_type.process_bind_param(case, None)
14199
14200 # Simulate database read
14201 decrypted = encrypted_json_type.process_result_value(encrypted, None)
14202
14203 if isinstance(case, BaseModel):
14204 assert type(case)(**decrypted) == case, f"Failed for case: {case}"
14205 else:
14206 assert decrypted == case, f"Failed for case: {case}"
14207 print(f"Success: {case}")
14208
14209
14210if __name__ == "__main__":
14211 test_encrypted_json()
14212
14213
14214
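For context, a minimal sketch of how `EncryptedJSON`/`MediumEncryptedJSON` would typically be attached to a SQLAlchemy model; the model and table names below are hypothetical and not part of the audited codebase:

```python
from sqlalchemy import Column, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class EncryptedDetails(Base):
    # Hypothetical example model: any JSON-serializable payload (or Pydantic
    # BaseModel) assigned to `details` is Fernet-encrypted on write and
    # decrypted back to Python objects on read.
    __tablename__ = "encrypted_details_example"
    id = Column(String(36), primary_key=True)
    details = Column(MediumEncryptedJSON)  # persisted as encrypted MEDIUMBLOB bytes
```

Because the encryption happens in the bind/result hooks, application code reads and writes `details` as plain Python values.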
14215---
14216File: /validator-api/validator_api/database/schemas.py
14217---
14218
14219from datetime import datetime
14220import enum
14221from typing import List, Optional
14222from pydantic import BaseModel, Field
14223
14224class TaskStatusEnum(enum.Enum):
14225 Ready = 'Ready'
14226 Running = 'Running'
14227 Stopped = 'Stopped'
14228 Completed = 'Completed'
14229
14230class FocusVideoEnum(enum.Enum):
14231 Uploaded = 'Uploaded'
14232 Available = 'Available'
14233 Pending = 'Pending'
14234 Purchased = 'Purchased'
14235 Submitted = 'Submitted'
14236 Consumed = 'Consumed'
14237
14238class TaskSchema(BaseModel):
14239 focusing_task: str = Field(...)
14240 duration: float | None = None
14241 description: str | None = None
14242 checked: bool | None = None
14243 date: str | None = None
14244 clip_link: str | None = None
14245 status: str | None = None
14246 score: float | None = None
14247 event: dict | None = None
14248
14249class UserSchema(BaseModel):
14250 email: str = Field(...)
14251 password: str = Field(...)
14252 nick_name: str = Field(...)
14253
14254class UserLoginSchema(BaseModel):
14255 email: str = Field(...)
14256 password: str = Field(...)
14257
14258class IpfsUrlSchema(BaseModel):
14259 url: str = Field(...)
14260 miner_hotkey: str = Field(...)
14261
14262class TimeSlot(BaseModel):
14263 start: str
14264 end: str
14265
14266class FocusTask(BaseModel):
14267 id: str
14268 name: str
14269 priority: str
14270 timeSlot: TimeSlot
14271 description: str
14272 steps: List[str]
14273 resources: List[str]
14274 challenges: List[str]
14275 focusTips: List[str]
14276 isCompleted: bool
14277 totalDuration: str
14278 category: Optional[str] = None
14279
14280class Metadata(BaseModel):
14281 date: str
14282 day: str
14283 lastUpdated: datetime
14284
14285class DailySchedule(BaseModel):
14286 metadata: Metadata
14287 tasks: List[FocusTask]
14288 tools: List[str]
14289
14290
14291class Link(BaseModel):
14292 url: str = Field(..., description="URL of the website")
14293 name: str = Field(..., description="Name of the website")
14294
14295class Step(BaseModel):
14296 title: str = Field(..., description="Title of the step")
14297 content: List[str] = Field(..., description="Content of the step in paragraphs")
14298 links: Optional[List[Link]] = Field(None, description="Relevant links for the step")
14299
14300class KeyPoint(BaseModel):
14301 title: str = Field(..., description="Title of the key point")
14302 details: List[str] = Field(..., description="Details of the key point")
14303 links: Optional[List[Link]] = Field(None, description="Relevant links for the key point")
14304
14305class Analysis(BaseModel):
14306 summary: str = Field(..., description="Summary of the analysis")
14307 points: List[str] = Field(..., description="Key points or recommendations")
14308 links: Optional[List[Link]] = Field(None, description="Relevant links for the analysis")
14309
14310class TextAnalysisReport(BaseModel):
14311 title: str = Field(..., description="Title of the report")
14312 introduction: str = Field(..., description="Introduction or overview of the report")
14313 steps: List[Step] = Field(..., description="Main steps of the report")
14314 keypoints: List[KeyPoint] = Field(..., description="Key points or findings")
14315 analysis: Analysis = Field(..., description="Overall analysis or conclusion")
14316 metadata: List[str] = Field(..., description="Additional metadata about the report")
14317 timestamp: str = Field(..., description="Timestamp of the report generation (ISO 8601 date string YYYY-MM-DDTHH:MM:SS-UTC)")
14318 links: Optional[List[Link]] = Field(None, description="General links for the entire report")
14319
14333
14334
14335
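A short, illustrative round-trip with the schemas above: `Field(...)` marks a field as required, so `TaskSchema` rejects payloads that omit `focusing_task`, while the remaining fields default to `None` (the values below are made up):

```python
from pydantic import ValidationError

task = TaskSchema(focusing_task="Write unit tests", duration=45.0)
print(task.status)  # None: optional fields default to None

try:
    TaskSchema(duration=45.0)  # missing the required focusing_task field
except ValidationError as exc:
    print(exc.error_count(), "validation error")  # 1 validation error
```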
14336---
14337File: /validator-api/validator_api/services/__init__.py
14338---
14339
14340
14341
14342
14343---
14344File: /validator-api/validator_api/services/focus_scoring_prompts.py
14345---
14346
14347
14348TASK_VALUATION_CRITERIA = """The kind of tasks that we want to see:
14349- Tasks that contribute to scientific discovery or AI advancement
14350- Creative acts that result in the creation of something new
14351- Tasks that demonstrate Chain of Thought (CoT) and are useful for training AI
14352- High novelty in approach or outcome
14353- Tasks that current AI systems struggle with
14354- Videos of coding, AI research, or solving AI engineering problems
14355- Application of the scientific process, including designing and implementing experiments
14356- Tasks that seek more knowledge or demonstrate critical thinking and creation
14357- Students learning challenging new material
14358- Office workers efficiently completing assigned work
14359
14360The kind of tasks that we don't want to see:
14361- Extremely mundane, boring, or repetitive tasks
14362- Tasks that can be easily completed by existing AI systems (e.g., basic copywriting)
14363- Tasks already present in existing datasets"""
14364
14365TASK_SCORE_SYSTEM_PROMPT = f"""
14366You are an AI tasked with evaluating proposed tasks for a cryptocurrency reward system. The goal is to encourage tasks that contribute to scientific discovery, AI advancement, creativity, education, and productivity while avoiding repetitive, busywork, or unproductive tasks.
14367
14368Here are the criteria for tasks:
14369
14370{TASK_VALUATION_CRITERIA}
14371
14372You will evaluate the task based on this rubric:
14373- Relevance: How well does the task align with what we want and avoid what we don't want?
14374- Impact: How significant is the task's potential contribution to our goals?
14375- Feasibility: Is the task realistic, achievable, and well-defined?
14376- Efficiency: Does the task make good use of resources in pursuing our objectives?
14377"""
14378
14379TASK_SCORE_USER_PROMPT = """
14380Here is the task to evaluate:
14381
14382<task_description>
14383{task_overview}
14384</task_description>
14385
14386Analyze this task based on the provided criteria and rubric. Consider both positive and negative aspects, and explain your thought process thoroughly.
14387
14388Provide your reasoning for why the task is or is not a good fit for the goal. Discuss how it aligns with or deviates from the criteria for what we want and don't want. Evaluate its potential impact, feasibility, and efficiency.
14389
14390After providing your reasoning, assign a score between 0.0 and 1.0 to indicate how well the task fits our goals. Use this scale:
14391- 0.0-0.2: Poor fit, largely irrelevant or counterproductive
14392- 0.2-0.4: Weak fit, minimal contribution to the goal
14393- 0.4-0.6: Moderate fit, somewhat helpful but not ideal
14394- 0.6-0.8: Good fit, clearly contributes to the goal
14395- 0.8-1.0: Excellent fit, highly effective in achieving the goal
14396
14397Remember to adhere to the JSON schema provided.
14398"""
14399
14400DETAILED_DESCRIPTION_SYSTEM_PROMPT = """
14401You are tasked with watching a screen recording of a human performing a task and creating a detailed annotation of the process. Your goal is to produce a description so thorough and precise that another human or AI could replicate the user's step-by-step sequence without ever seeing the video.
14402
14403After watching the video, you will create an annotation following the DetailedVideoDescription schema. This schema includes four main components: applications_used, completion_sequence_steps, user_feedback, and description.
14404
14405For each component of the schema, follow these guidelines:
14406
144071. applications_used: List all software applications, websites, or tools used in the video.
14408
144092. completion_sequence_steps: Provide a highly detailed, step-by-step breakdown of the entire process. Each step should be clear, concise, and actionable. Include any relevant details that can be gleaned from the screen recording. Number each step for clarity.
14410
144113. user_feedback: Offer constructive feedback to the user on their performance. Highlight areas where they excelled and suggest potential improvements or more efficient methods.
14412
144134. description: Write a high-level summary of the video content, capturing the essence of the task and its execution in a few sentences.
14414
14415When writing your annotation, be as precise and detailed as possible. Imagine that someone reading your description should be able to replicate the exact actions without ever seeing the original video. Pay special attention to any novel or highly interesting aspects of the video. Detail such aspects more thoroughly.
14416"""
14417
14418DETAILED_DESCRIPTION_USER_PROMPT = """
14419Watch the provided video carefully, paying close attention to every action taken by the user. Take note of the applications used, the sequence of steps performed, and any notable techniques employed.
14420
14421Note that the user is completing a task that is described as follows:
14422
14423<task_description>
14424{task_overview}
14425</task_description>
14426
14427Then, write a detailed description based on the criteria outlined. Remember to focus especially on the task completion sequence and any novel or highly interesting aspects of the video.
14428
14429Remember to be thorough, clear, and precise in your annotation. Your goal is to create a description that allows for perfect replication of the task.
14430
14431Remember to adhere to the JSON schema provided.
14432"""
14433
14434VIDEO_SCORING_SYSTEM_PROMPT = f"""
14435You are an expert in evaluating task completion based on video recordings.
14436Your role is to analyze a screen recording of a user performing a task and provide a detailed breakdown of their performance, focusing on how well they completed the assigned task.
14437
14438You will be provided with:
144391. A task overview describing the assigned task.
144402. The screen recording video of the user performing the task.
144413. A detailed description of the video content.
14442
14443Your goal is to evaluate the user's performance and provide a completion score following the CompletionScore schema.
14444This schema includes a final score and a rationale.
14445
14446For each component of the schema, follow these guidelines:
14447
144481. reasoning_steps: Provide a list of logical steps you took to arrive at your final score. Each step should be prefixed with "Step X: " where X is the step number. Start by reiterating the task overview and what some steps might look like to complete the task.
14449
144502. focus_score: Evaluate how focused the user was on completing the task, based on their actions. Score between 0.0 and 1.0.
14451
144523. educational_score: Assess how clear the user's steps are and how easy it is to follow along. Score between 0.0 and 1.0.
14453
144544. completion_score: Assess how well the user completed the task, considering their focus, distraction level, and how quickly they completed the task, relative to the task's difficulty. Score between 0.0 and 1.0.
14455
144565. creativity_score: Assess how creative the user's approach to the task was. Score between 0.0 and 1.0.
14457
144586. final_score: Calculate an overall completion score based on your evaluation. Score between 0.0 and 1.0.
14459
144607. rationale: Provide a concise explanation for the given completion score.
14461
14462Be thorough and objective in your evaluation, considering all aspects of the user's performance as described in the video description.
14463
14464Note that not all tasks are created equal. When evaluating the task completion, keep in mind the following criteria:
14465
14466{TASK_VALUATION_CRITERIA}
14467
14468Prioritize higher scores for tasks and completions that align with what we want to see, and lower scores for those that align with what we don't want to see.
14469"""
14470
14471VIDEO_SCORING_USER_PROMPT = """
14472Based on the task description and video provided, please provide a completion score breakdown. Evaluate how well the user completed the assigned task, considering their focus, the novelty of their approach, and overall effectiveness.
14473
14474<task_description>
14475{task_overview}
14476</task_description>
14477{detailed_video_description_string}
14478Use the following rubric to assign the focus_score:
14479- 0.0-0.2: Poor focus, distractions completely derail the task
14480- 0.2-0.4: Weak focus, distractions meaningfully affect the task but are overcome
14481- 0.4-0.6: Moderate focus, distractions are a minor inconvenience
14482- 0.6-0.8: Good focus, little to no distractions
14483- 0.8-1.0: Excellent focus, the user is completely engrossed in the task, in a flow state
14484
14485Use the following rubric to assign the educational_score:
14486- 0.0-0.2: Poor educational quality, the user's steps are unclear or difficult to follow
14487- 0.2-0.4: Weak educational quality, the user's steps can only be vaguely followed
14488- 0.4-0.6: Moderate educational quality, the user's steps are mostly clear but take some effort to follow
14489- 0.6-0.8: Good educational quality, the user's steps are clear and easy to follow
14490- 0.8-1.0: Excellent educational quality, the user's steps are exceptionally clear, well-structured, and easy to follow
14491
14492Use the following rubric to assign the creativity_score:
14493- 0.0-0.2: Poor creativity, the user's approach is unoriginal or uninteresting, not even enough to get the job done
14494- 0.2-0.4: Weak creativity, the user manages to get the job done but it's not very interesting or creative
14495- 0.4-0.6: Moderate creativity, the user's approach is original and creative
14496- 0.6-0.8: Good creativity, the user's approach is highly creative and innovative
14497- 0.8-1.0: Excellent creativity, the user's approach is groundbreaking and entirely novel
14498
14499Use the following rubric to assign the completion_score:
14500- 0.0-0.2: Poor task completion, largely irrelevant or counterproductive
14501- 0.2-0.4: Weak task completion, minimal contribution to the goal
14502- 0.4-0.6: Moderate task completion, somewhat helpful but not ideal
14503- 0.6-0.8: Good task completion, the task was diligently completed
14504- 0.8-1.0: Excellent task completion, the task was completed with high quality and efficiency
14505
14506For the final_score, use your best judgment to assign a score between 0.0 and 1.0 in light of the reasoning_steps, focus_score, educational_score, creativity_score, and completion_score.
14507
14508Remember to adhere to the JSON schema provided for the CompletionScore.
14509"""
14510
14511TASK_COMPLETION_SYSTEM_PROMPT = """
14512You are an expert in evaluating task completion based on video recordings.
14513Your role is to analyze a screen recording of a user performing a task and provide a detailed breakdown of their performance, focusing on how well they completed the assigned task.
14514Ignore the OMEGA Focus distraction notifications that may appear on the top right of the user's screen.
14515The content of these notifications should not be factored into your evaluation.
14516
14517You will be provided with:
145181. A task overview describing the assigned task.
145192. The screen recording video of the user performing the task.
145203. Detailed description of the user's actions in the video.
14521
14522Your goal is to evaluate the user's performance and provide a completion score following the CompletionScore schema.
14523This schema includes a final score and a rationale.
14524In the rationale, try to reference specific guidelines from the task overview/description to justify your score.
14525"""
14526
14527TASK_COMPLETION_USER_PROMPT = """
14528Based on the provided completion sequence steps and video provided, please provide a completion score breakdown.
14529Evaluate how well the user completed the assigned task, considering their focus and overall effectiveness.
14530Please use the task description to evaluate the user's performance, which may include specific steps needed to complete the task.
14531Ignore the OMEGA Focus distraction notifications that may appear on the top right of the user's screen.
14532EXTREMELY IMPORTANT: Again, the content of these distraction notifications should NOT be factored into your evaluation.
14533
14534This is the task overview:
14535<task_overview>
14536{task_overview}
14537</task_overview>
14538
14539This is the detailed description of the user's actions in the video, to aid you in your evaluation:
14540<completion_sequence_steps>
14541{completion_sequence_steps}
14542</completion_sequence_steps>
14543
14544If the user accomplishes the spirit of the task according to the task title, but does not complete it exactly as described according to the task description, you should still award some score (not 0.0).
14545
14546Use the following rubric to assign the completion_score:
14547- 0.0-0.2: Poor task completion, largely irrelevant or counterproductive
14548- 0.2-0.4: Weak task completion, minimal completion towards the goal
14549- 0.4-0.6: Moderate task completion, somewhat helpful but not ideal, maybe the user was distracted or did not follow the task description
14550- 0.6-0.8: Good task completion, the task was diligently completed
14551- 0.8-1.0: Excellent task completion, the task was completed with high quality and efficiency
14552"""
14553
14554BOOST_SCORING_SYSTEM_PROMPT = """
14555You are part of a system to evaluate and reward users for completing tasks.
14556You will be provided with a list of boosted tasks and their descriptions. Boosted tasks are tasks that receive a special multiplier that increases their score.
14557You will also be provided with a user's task description and a detailed video description.
14558Your current goal is to determine if the user-provided task matches any of the boosted tasks.
14559Return only the index of the boosted task that the user's task description most closely matches.
14560The user's task may or may not match any of the boosted tasks. If no match is found, return -1.
14561
14562Here are the boosted tasks:
14563{boosted_tasks}
14564"""
14565
14566BOOST_SCORING_USER_PROMPT = """
14567Here is the user's task title:
14568{focusing_task}
14569
14570Here is the detailed task description/breakdown:
14571{focusing_description}
14572"""
14573
14574
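These templates are plain Python strings filled in with `str.format`; for example, `TASK_SCORE_USER_PROMPT` takes only a `task_overview` argument (the task text here is illustrative):

```python
from validator_api.services import focus_scoring_prompts

user_prompt = focus_scoring_prompts.TASK_SCORE_USER_PROMPT.format(
    task_overview="# Reproduce a paper\n\nRe-implement the evaluation section of the paper."
)
print(user_prompt.splitlines()[1])  # "Here is the task to evaluate:"
```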
14575
14576---
14577File: /validator-api/validator_api/services/scoring_service.py
14578---
14579
14580import asyncio
14581from typing import List, Optional
14582import json
14583import random
14584import time
14585
14586from openai import AsyncOpenAI
14587from pydantic import BaseModel, Field, ValidationError
14588from sqlalchemy.orm import Session
14589import vertexai
14590from vertexai.generative_models import Part
14591from vertexai.preview import caching
14592from vertexai.preview.generative_models import (
14593 GenerativeModel, HarmCategory, HarmBlockThreshold, GenerationConfig,
14594)
14595from vertexai.vision_models import MultiModalEmbeddingModel, Video
14596from vertexai.vision_models import VideoSegmentConfig
14597from pinecone import Pinecone
14598
14599from validator_api.config import GOOGLE_PROJECT_ID, GOOGLE_LOCATION, OPENAI_API_KEY, GOOGLE_CLOUD_BUCKET_NAME, PINECONE_API_KEY
14600from validator_api.services import focus_scoring_prompts
14601from validator_api.utils import run_async, run_with_retries
14602from validator_api.database import get_db_context
14603from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoInternal
14604from validator_api.database.models.boosted_task import BoostedTask
14605from validator_api.database.models.task import TaskRecordPG
14606
14607from typing import Tuple  # Optional is already imported above
14608
14609TWO_MINUTES = 120 # in seconds
14610NINETY_MINUTES = 5400 # in seconds
14611FOCUS_VIDEO_MIN_SCORE = 0.05
14612FOCUS_VIDEO_MAX_SCORE = 1.0
14613MIN_VIDEO_UNIQUENESS_SCORE = 0.02
14614
14615def get_video_metadata(db: Session, video_id: str) -> Optional[FocusVideoInternal]:
14616 return db.query(FocusVideoRecord).filter(
14617 FocusVideoRecord.video_id == video_id,
14618 FocusVideoRecord.deleted_at.is_(None)
14619 ).first()
14620
14621async def query_pinecone(pinecone_index: Pinecone, vector: List[float]) -> float:
14622 async def _internal_async():
14623 response = await run_async(
14624 pinecone_index.query,
14625 vector=vector,
14626 top_k=1,
14627 )
14628 if len(response["matches"]) > 0:
14629 matches = response["matches"]
14630 similarity_score = matches[0]["score"]
14631 # for match in matches:
14632 # print(f"Match:")
14633 # print(f" - Score: {match['score']}")
14634 # print(f" - ID: {match.get('id', 'N/A')}")
14635 # print(f" - Metadata: {match.get('metadata', {})}")
14636 else:
14637            print("No Pinecone matches, returning 0")
14638 similarity_score = 0
14639 similarity_score = max(0.0, min(similarity_score, 1.0))
14640 return 1.0 - similarity_score
14641 return await run_with_retries(_internal_async)
14642
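# Worked example (hypothetical numbers): if Pinecone's top-1 match has a
# similarity score of 0.99, query_pinecone returns a uniqueness score of
# 1.0 - 0.99 = 0.01, which falls below MIN_VIDEO_UNIQUENESS_SCORE (0.02) and
# causes score_video below to raise VideoUniquenessError.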
14643class VideoUniquenessError(Exception):
14644 pass
14645
14646class TaskScoreBreakdown(BaseModel):
14647 reasoning_steps: List[str] = Field(description="Steps of reasoning used to arrive at the final score. Before each step, write the text 'Step X: '")
14648 final_score: float = Field(ge=0, le=1, description="Final score for the task, between 0.0 and 1.0")
14649 rationale: str = Field(description="Compendious user-facing explanation for the given score")
14650
14651class DetailedVideoDescription(BaseModel):
14652 applications_used: List[str] = Field(description="List of applications used in the video for completing the task")
14653 completion_sequence_steps: List[str] = Field(description="Highly detailed step-by-step breakdown of the sequence of steps taken to complete the task")
14654 user_feedback: str = Field(description="Feedback for the user to improve their task completion skills in the future")
14655 description: str = Field(description="High-level summary description of the video content")
14656
14657class CompletionScore(BaseModel):
14658 rationale: str = Field(description="Concise description of how well the user completed the task")
14659 completion_score: float = Field(ge=0, le=1, description="Final completion score, between 0.0 and 1.0")
14660
14661class VideoScore(BaseModel):
14662 # task and video scores
14663 # task_score: float
14664 task_uniqueness_score: Optional[float]
14665 video_completion_score: float
14666 description_uniqueness_score: Optional[float]
14667 video_uniqueness_score: float
14668 boosted_multiplier: Optional[float]
14669 final_score: float
14670
14671 # metadata
14672 task_overview: str
14673 # task_score_breakdown: TaskScoreBreakdown
14674 completion_score_breakdown: CompletionScore
14675 detailed_video_description: DetailedVideoDescription
14676
14677class FocusVideoEmbeddings(BaseModel):
14678 # embeddings
14679 task_overview_embedding: Optional[List[float]]
14680 detailed_video_description_embedding: Optional[List[float]]
14681 video_embedding: List[float]
14682
14683class BoostedTaskIndex(BaseModel):
14684 index: int
14685
14686class BoostedTaskData(BaseModel):
14687 title: str
14688 description: str
14689 multiplier: float
14690
14691def get_s3_path(video_id: str) -> str:
14692 return f"clips/{video_id}.webm"
14693
14694def get_gcs_uri(video_id: str) -> str:
14695 return f"gs://{GOOGLE_CLOUD_BUCKET_NAME}/{get_s3_path(video_id)}"
14696
14697class FocusScoringService:
14698 def __init__(self):
14699 vertexai.init(project=GOOGLE_PROJECT_ID, location=GOOGLE_LOCATION)
14700 self.model_name = "gemini-1.5-pro-001"
14701 print(f"Using model: {self.model_name}")
14702 self.safety_settings = {
14703 HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,
14704 HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
14705 HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
14706 HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
14707 }
14708 self.temperature = 1.3
14709 self.openai_client = AsyncOpenAI(api_key=OPENAI_API_KEY)
14710 self.task_overview_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-task-overview-index")
14711 self.video_description_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-video-description-index")
14712 self.completion_video_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-completion-video-index")
14713 # [gemini task score, task uniqueness score, completion score, description uniqueness score, video uniqueness score]
14714 self.coefficients = [0.23, 0.16, 0.29, 0.14, 0.18]
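        # Audit note: these weights appear to be legacy; score_video currently
        # computes final_score as completion_score * boosted_multiplier and
        # does not reference self.coefficients anywhere in this file.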
14715
14716 # Gemini API call related functions
14717
14718    async def make_gemini_request_with_retries(self, system_prompt: str, user_prompt: str, video_id: Optional[str], OutputClassSchema: type[BaseModel]) -> BaseModel:
14719 num_retries = 3
14720 for retry_idx in range(num_retries):
14721 try:
14722 start = time.time()
14723 output = await self.make_gemini_request(system_prompt, user_prompt, video_id, OutputClassSchema)
14724 print(f"Got gemini output in {time.time() - start} seconds for {OutputClassSchema.__name__}")
14725 return output
14726 except json.JSONDecodeError as e:
14727 print(f"Error parsing JSON from Gemini response for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
14728 await asyncio.sleep(1)
14729 except ValidationError as e:
14730 print(f"Error turning parsed JSON into Pydantic object for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
14731 await asyncio.sleep(1)
14732 except Exception as e:
14733 print(f"Error making Gemini request for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
14734 await asyncio.sleep(6)
14735 raise Exception(f"Failed to turn Gemini response into JSON and then into Pydantic object for {OutputClassSchema.__name__} after {num_retries} attempts")
14736
14737    async def make_gemini_request(self, system_prompt: str, user_prompt: str, video_id: Optional[str], OutputClassSchema: type[BaseModel]) -> BaseModel:
14738 model = GenerativeModel(
14739 self.model_name,
14740 system_instruction=system_prompt.strip(),
14741 safety_settings=self.safety_settings,
14742 generation_config=GenerationConfig(
14743 temperature=self.temperature,
14744 response_mime_type="application/json",
14745 response_schema=OutputClassSchema.model_json_schema(),
14746 ),
14747 )
14748
14749 parts = []
14750 if video_id:
14751 parts.append(Part.from_uri(get_gcs_uri(video_id), mime_type="video/webm"))
14752 parts.append(user_prompt.strip())
14753
14754 response = await model.generate_content_async(parts)
14755 return OutputClassSchema(**json.loads(response.text))
14756
14757 async def get_task_score_from_gemini(self, task_overview: str) -> TaskScoreBreakdown:
14758 return await self.make_gemini_request_with_retries(
14759 system_prompt=focus_scoring_prompts.TASK_SCORE_SYSTEM_PROMPT,
14760 user_prompt=focus_scoring_prompts.TASK_SCORE_USER_PROMPT.format(task_overview=task_overview),
14761 video_id=None,
14762 OutputClassSchema=TaskScoreBreakdown,
14763 )
14764
14765 async def get_detailed_video_description(self, video_id: str, task_overview: str) -> DetailedVideoDescription:
14766 return await self.make_gemini_request_with_retries(
14767 system_prompt=focus_scoring_prompts.DETAILED_DESCRIPTION_SYSTEM_PROMPT,
14768 user_prompt=focus_scoring_prompts.DETAILED_DESCRIPTION_USER_PROMPT.format(task_overview=task_overview),
14769 video_id=video_id,
14770 OutputClassSchema=DetailedVideoDescription,
14771 )
14772
14773 async def get_completion_score_breakdown(
14774 self,
14775 video_id: str,
14776 task_overview: str,
14777 detailed_video_description: Optional[DetailedVideoDescription] = None,
14778 system_prompt: str = focus_scoring_prompts.TASK_COMPLETION_SYSTEM_PROMPT,
14779 user_prompt: str = focus_scoring_prompts.TASK_COMPLETION_USER_PROMPT,
14780 ) -> CompletionScore:
14781 """
14782 This function generates a completion score breakdown for a given video.
14783
14784 Args:
14785 video_id (str): The ID of the video to be scored.
14786 task_overview (str): An overview of the task associated with the video.
14787 detailed_video_description (Optional[DetailedVideoDescription], optional): A detailed description of the video content. Defaults to None.
14788 system_prompt (str, optional): The system prompt to be used for generating the completion score. Defaults to focus_scoring_prompts.TASK_COMPLETION_SYSTEM_PROMPT.
14789 user_prompt (str, optional): The user prompt to be used for generating the completion score. Defaults to focus_scoring_prompts.TASK_COMPLETION_USER_PROMPT.
14790
14791 Returns:
14792 CompletionScore: The completion score breakdown for the video.
14793
14794 The user_prompt should include {task_overview} and {completion_sequence_steps}.
14795 """
14796 completion_sequence_steps_string = f"""\n\n
14797Additionally, here is a detailed description of the video content that you should reference along with the video:
14798
14799<completion_sequence_steps>
14800{detailed_video_description.completion_sequence_steps}
14801</completion_sequence_steps>
14802""" if detailed_video_description else ""
14803
14804 return await self.make_gemini_request_with_retries(
14805 system_prompt=system_prompt,
14806 user_prompt=user_prompt.format(
14807 task_overview=task_overview,
14808 completion_sequence_steps=completion_sequence_steps_string,
14809 ),
14810 video_id=video_id,
14811 OutputClassSchema=CompletionScore,
14812 )
14813
14814 # Pinecone related functions
14815
14816 async def get_task_uniqueness_score(self, task_overview_embedding: List[float]) -> float:
14817 return await query_pinecone(self.task_overview_index, task_overview_embedding)
14818
14819 async def get_description_uniqueness_score(self, detailed_video_description_embedding: List[float]) -> float:
14820 return await query_pinecone(self.video_description_index, detailed_video_description_embedding)
14821
14822 async def get_video_uniqueness_score(self, video_embedding: List[float]) -> float:
14823 return await query_pinecone(self.completion_video_index, video_embedding)
14824
14825 # Embedding related functions
14826
14827 def get_video_duration_seconds(self, video_id: str) -> int:
14828 with get_db_context() as db:
14829 video_metadata = get_video_metadata(db, video_id)
14830
14831 if video_metadata is None:
14832 raise ValueError(f"Focus video not found: {video_id}")
14833
14834 video_duration_seconds = video_metadata.video_details.get("duration")
14835 if video_duration_seconds is None:
14836 print(f"Video duration not found for video: {video_id}")
14837 video_duration_seconds = 120
14838
14839 return video_duration_seconds
14840
14841 async def get_video_embedding(self, video_id: str, video_duration_seconds: int) -> List[float]:
14842 async def _internal_async():
14843 model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
14844 start_offset_sec = random.randint(0, max(0, video_duration_seconds - 120))
14845 end_offset_sec = min(video_duration_seconds, start_offset_sec + 120)
14846 embeddings = await run_async(
14847 model.get_embeddings,
14848 video=Video.load_from_file(get_gcs_uri(video_id)),
14849 video_segment_config=VideoSegmentConfig(
14850 start_offset_sec=start_offset_sec,
14851 end_offset_sec=end_offset_sec,
14852 interval_sec=end_offset_sec - start_offset_sec
14853 )
14854 )
14855 return embeddings.video_embeddings[0].embedding
14856 return await run_with_retries(_internal_async)
14857
14858 async def get_text_embedding(self, text: str) -> Optional[List[float]]:
14859 async def _internal_async():
14860 response = await asyncio.wait_for(self.openai_client.embeddings.create(
14861 input=text,
14862 model="text-embedding-3-large"
14863 ), timeout=10)
14864 return response.data[0].embedding
14865
14866 try:
14867 return await run_with_retries(_internal_async)
14868 except Exception as e:
14869 print(f"Error getting text embedding: {e}")
14870 return None
14871
14872 async def embed_and_get_task_uniqueness_score(self, task_overview: str) -> Tuple[Optional[List[float]], Optional[float]]:
14873 embedding = await self.get_text_embedding(task_overview)
14874 if embedding is None:
14875 return None, None
14876 return embedding, await self.get_task_uniqueness_score(embedding)
14877
14878 async def embed_and_get_video_uniqueness_score(self, video_id: str, video_duration_seconds: int):
14879 embedding = await self.get_video_embedding(video_id, video_duration_seconds)
14880 return embedding, await self.get_video_uniqueness_score(embedding)
14881
14882 async def get_detailed_video_description_embedding_score(self, video_id, task_overview):
14883 detailed_video_description = await self.get_detailed_video_description(video_id, task_overview)
14884 embedding = await self.get_text_embedding(detailed_video_description.model_dump_json())
14885 if embedding is None:
14886 return detailed_video_description, None, None
14887 return detailed_video_description, embedding, await self.get_description_uniqueness_score(embedding)
14888
14889 async def score_video(self, video_id: str, focusing_task: str, focusing_description: str):
14890 """
14891 The video score is a score of how well the user completed the task, based on the task overview and the detailed video description.
14892 If the video is too similar to other videos, it will be rejected.
14893 Errors raised should make the video rejected.
14894 """
14895 boosted_multiplier = 1.0
14896 with get_db_context() as db:
14897        # If the task is boosted, use the boosted task info directly.
14900 video_metadata = get_video_metadata(db, video_id)
14901 if video_metadata and video_metadata.task_id:
14902 task = db.query(TaskRecordPG).filter(
14903 TaskRecordPG.id == video_metadata.task_id,
14904 ).first()
14905 if task:
14906 boosted_task = db.query(BoostedTask).filter(
14907 BoostedTask.id == task.boosted_id
14908 ).first()
14909 if boosted_task:
14910 boosted_multiplier = boosted_task.multiplier
14911 focusing_task = boosted_task.title
14912 focusing_description = boosted_task.description
14913 # print(f"Scoring boosted task index {boosted_task.id} multiplier {boosted_multiplier}\n\n{boosted_task.title}\n\n{boosted_task.description}")
14914
14915
14916 video_duration_seconds = self.get_video_duration_seconds(video_id)
14917
14918 if video_duration_seconds < TWO_MINUTES:
14919 raise ValueError(f"Video duration is too short: {video_duration_seconds} seconds")
14920
14921 if video_duration_seconds > NINETY_MINUTES:
14922 raise ValueError(f"Video duration is too long: {video_duration_seconds} seconds")
14923
14924 task_overview = f"# {focusing_task}\n\n{focusing_description}"
14925
14926 (
14927 (task_overview_embedding, task_uniqueness_score),
14928 # task_score_breakdown,
14929 (video_description, video_description_embedding, video_description_uniqueness_score),
14930 (video_embedding, video_uniqueness_score),
14931 ) = await asyncio.gather(
14932 self.embed_and_get_task_uniqueness_score(task_overview), # uses openai to get embedding
14933 # self.get_task_score_from_gemini(task_overview), # uses gemini to score task
14934 self.get_detailed_video_description_embedding_score(video_id, task_overview), # uses gemini to get detailed description
14935 self.embed_and_get_video_uniqueness_score(video_id, video_duration_seconds),
14936 )
14937
14938 if video_uniqueness_score < MIN_VIDEO_UNIQUENESS_SCORE:
14939 raise VideoUniquenessError("Video uniqueness score is too low.")
14940
14941 completion_score_breakdown = await self.get_completion_score_breakdown(
14942 video_id,
14943 task_overview,
14944 detailed_video_description=video_description,
14945 )
14946
14947 completion_gemini_score = completion_score_breakdown.completion_score
14948 final_score = completion_gemini_score * boosted_multiplier
14949
14950 print(f"Final score: {final_score}")
14951 print(f"completion score breakdown: {completion_score_breakdown}")
14952
14953 return VideoScore(
14954 task_uniqueness_score=task_uniqueness_score,
14955 video_completion_score=completion_gemini_score,
14956 description_uniqueness_score=video_description_uniqueness_score,
14957 video_uniqueness_score=video_uniqueness_score,
14958 boosted_multiplier=boosted_multiplier,
14959 final_score=final_score,
14960 task_overview=task_overview,
14961 completion_score_breakdown=completion_score_breakdown,
14962 detailed_video_description=video_description,
14963 ), FocusVideoEmbeddings(
14964 task_overview_embedding=task_overview_embedding,
14965 detailed_video_description_embedding=video_description_embedding,
14966 video_embedding=video_embedding,
14967 )
14968
14969 # async def get_model_cached_on_video(self, video_id: str) -> GenerativeModel:
14970 # video_part = Part.from_uri(get_gcs_uri(video_id), mime_type="video/webm")
14971 # cached_content = caching.CachedContent.create(
14972 # model_name=self.model_name,
14973 # system_instruction="You are an expert video description generator. You are given a video and a task and you need to generate a detailed description of the video.",
14974 # contents=[video_part],
14975 # ttl=datetime.timedelta(minutes=5),
14976 # )
14977 # return GenerativeModel.from_cached_content(cached_content=cached_content)
14978
14979
14980def main():
14981    service = FocusScoringService()
14982    # asyncio and time are already imported at module scope, so the local
14983    # imports were redundant; the inner coroutine is named _amain so it does
14984    # not shadow this outer main().
14985    async def _amain():
14986 video_id = "29f91a6f-1393-4765-ba00-263b4cff28b6"
14987 task_overview = """
14988# Multimodal tokenization research
14989
14990Read the Show-O paper to understand how they have trained a unified diffusion and autoregressive model for multimodal tokenization.
14991""".strip()
14992
14993 score_details = await service.score_video(video_id, task_overview, "description")
14994 print(score_details)
14995
14996 # task_overview_embedding = await service.get_text_embedding(task_overview)
14997 # print(len(task_overview_embedding))
14998
14999 # detailed_video_description = DetailedVideoDescription(
15000 # applications_used=[],
15001 # completion_sequence_steps=[],
15002 # user_feedback="",
15003 # description=""
15004 # )
15005
15006 # video_embedding = await service.get_video_embedding(video_id, 1740)
15007 # print(f"Sum: {sum(video_embedding)}, min: {min(video_embedding)}, max: {max(video_embedding)}")
15008
15009 # task_score_breakdown = await service.get_task_score_from_gemini(task_overview)
15010 # print(task_score_breakdown)
15011
15012 # completion_score_breakdown = await service.get_completion_score_breakdown(video_id, task_overview, detailed_video_description=None)
15013 # print(completion_score_breakdown)
15014
15015 # start = time.time()
15016 # model = service.get_model_cached_on_video(video_id)
15017 # print(f"Got model in {time.time() - start} seconds")
15018 # for _ in range(4):
15019 # start = time.time()
15020 # video_description = await service.get_detailed_video_description_from_cache(model)
15021 # print(f"Got detailed video description ({video_description}) in {time.time() - start} seconds")
15022
15023 # for _ in range(4):
15024 # start = time.time()
15025 # video_description = await service.get_detailed_video_description(video_id)
15026 # print(f"Got detailed video description ({video_description}) in {time.time() - start} seconds")
15027
15028    asyncio.run(_amain())
15029
15030
15031if __name__ == "__main__":
15032 main()
15033
15034
15035
15036---
15037File: /validator-api/validator_api/utils/__init__.py
15038---
15039
15040import asyncio, functools
15041
15042RETRIES=3
15043DELAY_SECS=2
15044
15045def run_async(func, *args, **kwargs):
15046    loop = asyncio.get_running_loop()  # preferred over the deprecated get_event_loop() in coroutine context
15047 return loop.run_in_executor(None, functools.partial(func, *args, **kwargs))
15048
15049async def run_with_retries(func, *args, **kwargs):
15050 """ func can be sync or async, since we await the output if it's a coroutine """
15051    for i in range(RETRIES):
15052 try:
15053 output = func(*args, **kwargs)
15054 if asyncio.iscoroutine(output):
15055 return await output
15056 else:
15057 return output
15058        except Exception:  # a bare except would also swallow KeyboardInterrupt/SystemExit
15059 if i == RETRIES - 1:
15060 raise
15061 await asyncio.sleep(DELAY_SECS)
15062 raise RuntimeError("Should never happen")
15063
15064
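`run_with_retries` accepts either a sync or an async callable; a small self-contained sketch (the `flaky` coroutine is purely illustrative):

```python
import asyncio
from validator_api.utils import run_with_retries

attempts = {"n": 0}

async def flaky():
    # Fails twice, then succeeds; run_with_retries sleeps DELAY_SECS between tries.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("transient failure")
    return "ok"

print(asyncio.run(run_with_retries(flaky)))  # prints "ok" on the third attempt
```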
15065
15066---
15067File: /validator-api/validator_api/utils/marketplace.py
15068---
15069
15070import time
15071from typing import Tuple, Dict
15072import requests
15073import bittensor as bt
15074from validator_api.config import (
15075 NETWORK, BT_TESTNET, NETUID, FOCUS_REWARDS_PERCENT, FIXED_TAO_USD_ESTIMATE,
15076 BOOSTED_TASKS_PERCENTAGE,
15077)
15078from validator_api.utils import run_with_retries, run_async
15079from validator_api.database.models.focus_video_record import TaskType
15080
15081TASK_TYPE_MAP = {
15082 TaskType.USER: 1 - BOOSTED_TASKS_PERCENTAGE,
15083 TaskType.BOOSTED: BOOSTED_TASKS_PERCENTAGE,
15084}
15085
15086
15087async def get_subtensor_and_metagraph() -> Tuple[bt.subtensor, bt.metagraph]:
15088
15089 def _internal() -> Tuple[bt.subtensor, bt.metagraph]:
15090 subtensor = bt.subtensor(network=NETWORK)
15091 metagraph = bt.metagraph(NETUID)
15092 return subtensor, metagraph
15093
15094 return await run_with_retries(_internal)
15095
15096
15097async def get_tao_price() -> float:
15098 return await run_with_retries(
15099 lambda: float(
15100 requests.get(
15101 "https://api.kucoin.com/api/v1/market/stats?symbol=TAO-USDT"
15102 ).json()["data"]["last"]
15103 )
15104 )
15105
15106# Global cache for max focus TAO
15107max_focus_tao_cache = {
15108 'value': None,
15109 'timestamp': 0
15110}
15111
15112CACHE_DURATION = 30 * 60 # 30 minutes in seconds
15113
15114async def get_max_focus_tao() -> float:
15115 global max_focus_tao_cache
15116 current_time = time.time()
15117
15118 # Check if cached data is still valid
15119 if max_focus_tao_cache['value'] is not None and current_time - max_focus_tao_cache['timestamp'] < CACHE_DURATION:
15120 return max_focus_tao_cache['value']
15121
15122 # If cache is invalid or empty, recalculate
15123 subtensor, metagraph = await get_subtensor_and_metagraph()
15124
15125 def _internal_sync():
15126 current_block = metagraph.block.item()
15127 metagraph.sync(current_block - 10, lite=False, subtensor=subtensor)
15128
15129 total_vali_and_miner_emission = 0
15130 for uid in metagraph.uids.tolist():
15131 total_vali_and_miner_emission += metagraph.emission[uid]
15132
15133 total_miner_emission = total_vali_and_miner_emission / 2 # per tempo
15134 total_miner_emission_per_day = total_miner_emission * 20 # 20 tempo intervals per day
15135 max_focus_tao = total_miner_emission_per_day * FOCUS_REWARDS_PERCENT
15136
15137 if NETWORK == BT_TESTNET:
15138 max_focus_tao = max(2, max_focus_tao)
15139 # max_focus_tao = max(18, max_focus_tao) # 92 tao per day cuz 3.12% emissions * 20% budget
15140
15141 return max_focus_tao
15142
15143 async def _internal_async() -> float:
15144 return await run_async(_internal_sync)
15145
15146 max_focus_tao = await run_with_retries(_internal_async)
15147
15148 # Update cache
15149 max_focus_tao_cache['value'] = max_focus_tao
15150 max_focus_tao_cache['timestamp'] = current_time
15151
15152 return max_focus_tao
15153
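# Worked example (hypothetical numbers): with a total validator+miner emission
# of 10 TAO per tempo, miners receive 10 / 2 = 5 TAO per tempo, i.e.
# 5 * 20 = 100 TAO per day; if FOCUS_REWARDS_PERCENT were 0.05, the daily
# focus budget would be 100 * 0.05 = 5 TAO.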
15154
15155async def get_purchase_max_focus_tao() -> float:
15156 """we want to limit the amount of focus tao that can be purchased to 90% of the max focus tao so miners can make some profit"""
15157 max_focus_tao = await get_max_focus_tao()
15158 return max_focus_tao * 0.9
15159
15160
15161def get_dollars_available_today(max_focus_tao: float) -> float:
15162 """ Use a fixed TAO - USD estimate to keep consistent for the sake of miner rewards """
15163 return max_focus_tao * FIXED_TAO_USD_ESTIMATE
15164
15165def get_max_focus_points_available_today(max_focus_tao: float) -> int:
15166 # 1 point = 1 dollar
15167 return int(get_dollars_available_today(max_focus_tao))
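# Continuing the hypothetical example above: 5 TAO/day at the default
# FIXED_TAO_USD_ESTIMATE of 300.0 USD/TAO yields $1500, i.e. 1500 focus points
# available for the day (1 point = 1 dollar).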
15168
15169MAX_TASK_REWARD_TAO = 0.1
15170
15171
15172
15173---
15174File: /validator-api/validator_api/utils/wallet.py
15175---
15176
15177import bittensor as bt
15178import aiohttp
15179import time
15180from validator_api.utils import run_with_retries, run_async
15181from typing import List
15182
15183
15184# Global cache for TAO/USD rate
15185tao_usd_cache = {
15186 'rate': None,
15187 'timestamp': 0
15188}
15189
15190CACHE_DURATION = 30 * 60 # 30 minutes in seconds
15191
15192async def get_tao_usd_rate() -> float:
15193 global tao_usd_cache
15194 current_time = time.time()
15195
15196 # Check if cached data is still valid
15197 if tao_usd_cache['rate'] is not None and current_time - tao_usd_cache['timestamp'] < CACHE_DURATION:
15198 return tao_usd_cache['rate']
15199
15200 try:
15201 async with aiohttp.ClientSession() as session:
15202 async with session.get('https://taostats.io/data.json') as response:
15203 if response.status == 200:
15204 data = await response.json()
15205 rate = float(data[0]['price'])
15206
15207 # Update cache
15208 tao_usd_cache['rate'] = rate
15209 tao_usd_cache['timestamp'] = current_time
15210
15211 return rate
15212 else:
15213 print(f"Failed to fetch TAO/USD rate. Status code: {response.status}")
15214 return tao_usd_cache['rate']
15215 except Exception as e:
15216 print(f"Error fetching TAO/USD rate: {str(e)}")
15217 return tao_usd_cache['rate']
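# Audit note: if the very first fetch fails, tao_usd_cache['rate'] is still
# None, so get_tao_usd_rate can return None despite its float annotation;
# callers should guard against that case.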
15218
15219async def check_wallet_tao_balance(wallet_key: str, subtensor_network: str) -> float:
15220 def _internal_sync() -> float:
15221 subtensor = bt.subtensor(network=subtensor_network)
15222 balance = subtensor.get_balance(wallet_key).tao
15223 return balance
15224
15225 async def _internal_async() -> float:
15226 return await run_async(_internal_sync)
15227
15228 return await run_with_retries(_internal_async)
15229
15230
15231API_URL = "https://api.subquery.network/sq/TaoStats/bittensor-indexer"
15232MAX_TXN = 50
15233GRAPHQL_QUERY = """
15234query ($first: Int!, $after: Cursor, $filter: TransferFilter, $order: [TransfersOrderBy!]!) {
15235 transfers(first: $first, after: $after, filter: $filter, orderBy: $order) {
15236 nodes {
15237 id
15238 from
15239 to
15240 amount
15241 extrinsicId
15242 blockNumber
15243 }
15244 pageInfo {
15245 endCursor
15246 hasNextPage
15247 hasPreviousPage
15248 }
15249 totalCount
15250 }
15251}
15252"""
15253
15254async def get_transaction_from_block_hash(subtensor, wallet_address: str, block_hash: str) -> List[dict]:
15255 """Get all transfers associated with the provided wallet address and block_hash."""
15256 transactions = []
15257 divisor = 1e9
15258
15259 block = subtensor.substrate.get_block(block_hash)
15260 block_num = block['header']['number']
15261
15262 for extrinsic in block['extrinsics']:
15263 extrinsic = extrinsic.value
15264 if 'call' in extrinsic and extrinsic['call']['call_module'] == 'Balances':
15265 if extrinsic['call']['call_function'] in ['transfer', 'transfer_allow_death']:
15266 sender = extrinsic.get('address', 'Unknown')
15267 recipient = extrinsic['call']['call_args'][0]['value']
15268 amount = int(extrinsic['call']['call_args'][1]['value'])
15269
15270 if sender == wallet_address or recipient == wallet_address:
15271 transactions.append({
15272 'id': extrinsic['extrinsic_hash'],
15273 'from': sender,
15274 'to': recipient,
15275 'amount': amount / divisor,
15276 # the Id is not actually supposed to be the hash, but we'll let it fly
15277 # for now cause all we need is a unique identifier, which the hash is
15278 'extrinsicId': extrinsic['extrinsic_hash'],
15279 'blockNumber': block_num
15280 })
15281
15282 return transactions[::-1]
15283
15284
15285
15286---
15287File: /validator-api/validator_api/config.py
15288---
15289
15290import os
15291from dotenv import load_dotenv
15292import json
15293from typing import List
15294import boto3
15295from omega import constants
15296
15297load_dotenv(override=True)
15298
15299def get_secret(secret_name, region_name):
15300 # Create a Secrets Manager client
15301 session = boto3.session.Session()
15302 client = session.client(
15303 service_name='secretsmanager',
15304 region_name=region_name,
15305 )
15306
15307 get_secret_value_response = client.get_secret_value(
15308 SecretId=secret_name
15309 )
15310
15311 # For a list of exceptions thrown, see
15312 # https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
15313
15314 # Decrypts secret using the associated KMS key.
15315 secret = get_secret_value_response['SecretString']
15316
15317 return secret
15318
15319def parse_proxies(proxy_list: List[str]) -> List[str]:
15320 transformed_proxies = []
15321 for proxy in proxy_list:
15322 proxy_ip, proxy_port, proxy_user, proxy_pass = proxy.split(':')
15323 transformed_proxies.append(f"http://{proxy_user}:{proxy_pass}@{proxy_ip}:{proxy_port}")
15324 return transformed_proxies
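# Example: parse_proxies(["1.2.3.4:8080:user:pass"]) returns
# ["http://user:pass@1.2.3.4:8080"].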
15325
15326NETWORK = os.environ["NETWORK"]
15327NETUID = int(os.environ["NETUID"])
15328
15329ENABLE_COMMUNE = os.environ["ENABLE_COMMUNE"] == "True"
15330print("Running with ENABLE_COMMUNE:", ENABLE_COMMUNE)
15331COMMUNE_NETWORK = os.environ["COMMUNE_NETWORK"]
15332COMMUNE_NETUID = int(os.environ["COMMUNE_NETUID"])
15333
15334API_KEY_NAME = "OMEGA_MM_API_KEY"
15335API_KEYS = json.loads(os.environ["API_KEYS"])
15336
15337PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
15338PINECONE_INDEX = os.environ["PINECONE_INDEX"]
15339PINECONE_AUDIO_INDEX = os.environ["PINECONE_AUDIO_INDEX"]
15340HF_TOKEN = os.environ["HF_TOKEN"]
15341HF_REPO = os.environ["HF_REPO"]
15342HF_AUDIO_REPO = os.environ["HF_AUDIO_REPO"]
15343REPO_TYPE = "dataset"
15344TOPICS_LIST = json.loads(os.environ["TOPICS_LIST"])
15345PROXY_LIST = parse_proxies(json.loads(os.environ["PROXY_LIST"]))
15346IS_PROD = os.environ.get("IS_PROD", "false").lower() == "true"
15347CHECK_PROBABILITY = float(os.environ.get("CHECK_PROBABILITY", 0.1))
15348UPLOAD_BATCH_SIZE = int(os.environ.get("UPLOAD_BATCH_SIZE", 1024))
15349UPLOAD_AUDIO_BATCH_SIZE = int(os.environ.get("UPLOAD_AUDIO_BATCH_SIZE", 256))
15350
15351DB_CONFIG = {
15352 'user': os.environ["DBUSER"],
15353 'password': os.environ["DBPASS"],
15354 'host': os.environ["DBHOST"],
15355 'database': os.environ["DBNAME"]
15356}
15357
15358# Omega Focus Constants
15359FOCUS_DB_HOST = os.environ["FOCUS_DB_HOST"]
15360FOCUS_DB_NAME = os.environ["FOCUS_DB_NAME"]
15361FOCUS_DB_USER = os.environ["FOCUS_DB_USER"]
15362FOCUS_DB_PASSWORD = os.environ["FOCUS_DB_PASSWORD"]
15363FOCUS_DB_PORT = int(os.getenv("FOCUS_DB_PORT", "5432"))
15364DB_STRING_LENGTH = 200
15365DB_STRING_LENGTH_LONG = 500
15366ENCRYPTION_KEY = os.environ["ENCRYPTION_KEY"]
15367
15368BT_TESTNET = "test"
15369BT_MAINNET = "finney"
15370assert NETWORK in [BT_TESTNET, BT_MAINNET], "NETWORK must be either test or finney"
15371TAO_REFRESH_INTERVAL_MINUTES = int(os.getenv('TAO_REFRESH_INTERVAL_MINUTES', 10))
15372
15373FOCUS_REWARDS_PERCENT = float(os.getenv('FOCUS_REWARDS_PERCENT', constants.FOCUS_REWARDS_PERCENT))
15374FOCUS_API_KEYS = json.loads(os.environ["FOCUS_API_KEYS"])
15375GOOGLE_AI_API_KEY = os.environ["GOOGLE_AI_API_KEY"]
15376OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
15377AWS_ACCESS_KEY_ID = os.environ["AWS_ACCESS_KEY_ID"]
15378AWS_SECRET_ACCESS_KEY = os.environ["AWS_SECRET_ACCESS_KEY"]
15379AWS_S3_REGION = os.environ["AWS_S3_REGION"]
15380AWS_S3_BUCKET_NAME = os.environ["AWS_S3_BUCKET_NAME"]
15381
15382MAX_FOCUS_POINTS_PER_HOUR = int(os.getenv("MAX_FOCUS_POINTS_PER_HOUR", 80)) # $80 / hour
15383FIXED_TAO_USD_ESTIMATE = float(os.getenv("FIXED_TAO_USD_ESTIMATE", 300.0))
15384BOOSTED_TASKS_PERCENTAGE = float(os.getenv("BOOSTED_TASKS_PERCENTAGE", 0.7))
15385
15386GOOGLE_PROJECT_ID = os.getenv("GOOGLE_PROJECT_ID")
15387GOOGLE_LOCATION = os.getenv("GOOGLE_LOCATION", "us-central1")
15388GOOGLE_APPLICATION_CREDENTIALS = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
15389GOOGLE_CLOUD_BUCKET_NAME = os.getenv("GOOGLE_CLOUD_BUCKET_NAME")
15390
15391with open(GOOGLE_APPLICATION_CREDENTIALS, "w") as f:  # assumes GOOGLE_APPLICATION_CREDENTIALS is set; open(None) would raise
15392 f.write(get_secret("prod/gcp_service_user", region_name=AWS_S3_REGION))
15393
15394SENTRY_DSN = os.getenv("SENTRY_DSN")
15395IMPORT_SCORE = os.getenv("IMPORT_SCORE", "true").lower() == "true"
15396
15397
15398---
15399File: /validator-api/validator_api/dataset_upload.py
15400---
15401
15402from io import BytesIO
15403from typing import List
15404from datetime import datetime
15405import random
15406import tempfile
15407
15408from datasets import Dataset, Audio
15409from huggingface_hub import HfApi
15410import ulid
15411import soundfile as sf
15412import base64
15413
15414from omega.protocol import VideoMetadata, AudioMetadata
15415
15416from validator_api import config
15417
15418
15419HF_API = HfApi(token=config.HF_TOKEN)
15420NUM_BUCKETS = 1000
15421
15422
15423def get_data_path(batch_ulid_str: str) -> str:
15424 batch_ulid = ulid.from_str(batch_ulid_str)
15425 bucket = batch_ulid.int % NUM_BUCKETS
15426 return f"default/train/{bucket:03d}/{batch_ulid_str}.parquet"
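# Example (hypothetical ULID): a batch whose ULID integer value is 123456 maps
# to bucket 123456 % 1000 = 456, producing a path like
# "default/train/456/<batch_ulid>.parquet".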
15427
15428
15429def get_random_batch_size(batch_size: int) -> int:
15430 return random.choice([
15431 batch_size // 2,
15432 batch_size,
15433 batch_size * 2,
15434 ])
15435
15436def create_repo(name: str) -> None:
15437 try:
15438 HF_API.create_repo(
15439 repo_id=name,
15440 repo_type=config.REPO_TYPE,
15441 exist_ok=True,
15442 token=config.HF_TOKEN
15443 )
15444 print("Successfully created/verified repository")
15445 except Exception as e:
15446 print(f"Error creating repository: {e}")
15447
15448class DatasetUploader:
15449 def __init__(self):
15450 self.current_batch = []
15451 self.desired_batch_size = get_random_batch_size(config.UPLOAD_BATCH_SIZE)
15452 self.min_batch_size = 32
15453
15454 def add_videos(
15455 self, metadata: List[VideoMetadata], video_ids: List[str],
15456 description_relevance_scores: List[float], query_relevance_scores: List[float],
15457 query: str,
15458 ) -> None:
15459 curr_time = datetime.now()
15460 self.current_batch.extend([
15461 {
15462 "video_id": vid_uuid,
15463 "youtube_id": video.video_id,
15464 "description": video.description,
15465 "views": video.views,
15466 "start_time": video.start_time,
15467 "end_time": video.end_time,
15468 "video_embed": video.video_emb,
15469 "audio_embed": video.audio_emb,
15470 "description_embed": video.description_emb,
15471 "description_relevance_score": desc_score,
15472 "query_relevance_score": query_score,
15473 "query": query,
15474 "submitted_at": int(curr_time.timestamp()),
15475 }
15476 for vid_uuid, video, desc_score, query_score
15477 in zip(video_ids, metadata, description_relevance_scores, query_relevance_scores)
15478 ])
15479 print(f"Added {len(metadata)} videos to batch, now have {len(self.current_batch)}")
15480 if len(self.current_batch) >= self.desired_batch_size:
15481 self.submit()
15482
15483 def submit(self) -> None:
15484 if len(self.current_batch) < self.min_batch_size:
15485 print(f"Need at least {self.min_batch_size} videos to submit, but have {len(self.current_batch)}")
15486 return
15487 data = self.current_batch[:self.desired_batch_size]
15488        print(f"Uploading batch of {len(data)} videos")
15489 with BytesIO() as f:
15490 dataset = Dataset.from_list(data)
15491 num_bytes = dataset.to_parquet(f)
15492 try:
15493 HF_API.upload_file(
15494 path_or_fileobj=f,
15495 path_in_repo=get_data_path(str(ulid.new())),
15496 repo_id=config.HF_REPO,
15497 repo_type=config.REPO_TYPE,
15498 token=config.HF_TOKEN,
15499 )
15500 print(f"Uploaded {num_bytes} bytes to Hugging Face")
15501 except Exception as e:
15502 print(f"Error uploading to Hugging Face: {e}")
15503 self.current_batch = self.current_batch[self.desired_batch_size:]
15504 self.desired_batch_size = get_random_batch_size(config.UPLOAD_BATCH_SIZE)
15505
15506
15507
15508class AudioDatasetUploader:
15509 def __init__(self):
15510 self.current_batch = []
15511 self.min_batch_size = 8
15512 self.desired_batch_size = get_random_batch_size(config.UPLOAD_AUDIO_BATCH_SIZE)
15513
15514 def convert_audio_to_wav(self, audio_bytes: str) -> bytes:
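        # Note: despite its name, this helper does not transcode audio; it
        # base64-decodes the payload and round-trips it through a temporary
        # .wav file handle.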
15515 temp_audiofile = tempfile.NamedTemporaryFile(suffix=".wav")
15516 audio_bytes = base64.b64decode(audio_bytes)
15517 with open(temp_audiofile.name, "wb") as f:
15518 f.write(audio_bytes)
15519 return temp_audiofile.read()
15520
15521 def add_audios(
15522 self, metadata: List[AudioMetadata], audio_ids: List[str],
15523 inverse_der: float, audio_length_score: float,
15524 audio_quality_total_score: float, audio_query_score: float,
15525 query: str, total_score: float
15526 ) -> None:
15527 curr_time = datetime.now()
15528
15529 audio_files = [self.convert_audio_to_wav(audio.audio_bytes) for audio in metadata]
15530
15531
15532
15533 self.current_batch.extend([
15534 {
15535 "audio_id": audio_uuid,
15536 "youtube_id": audio.video_id,
15537 # "audio_bytes": audio.audio_bytes,
15538 "audio": {"path": audio_file, "array": sf.read(BytesIO(base64.b64decode(audio.audio_bytes)))[0], "sampling_rate": 16000},
15539 "start_time": audio.start_time,
15540 "end_time": audio.end_time,
15541 "audio_embed": audio.audio_emb,
15542 "diar_timestamps_start": audio.diar_timestamps_start,
15543 "diar_timestamps_end": audio.diar_timestamps_end,
15544 "diar_speakers": audio.diar_speakers,
15545 "inverse_der": inverse_der,
15546 "audio_length_score": audio_length_score,
15547 "audio_quality_score": audio_quality_total_score,
15548 "query_relevance_score": audio_query_score,
15549 "total_score": total_score,
15550 "query": query,
15551 "submitted_at": int(curr_time.timestamp()),
15552 }
15553 for audio_uuid, audio_file, audio in zip(audio_ids, audio_files, metadata)
15554 ])
15555 print(f"Added {len(metadata)} audios to batch, now have {len(self.current_batch)}")
15556 if len(self.current_batch) >= self.desired_batch_size:
15557 self.submit()
15558
15559 def submit(self) -> None:
15560 if len(self.current_batch) < self.min_batch_size:
15561 print(f"Need at least {self.min_batch_size} audios to submit, but have {len(self.current_batch)}")
15562 return
15563 data = self.current_batch[:self.desired_batch_size]
15564 print(f"Uploading batch of {len(self.current_batch)} audios")
15565 with BytesIO() as f:
15566 dataset = Dataset.from_list(data)
15567 dataset = dataset.cast_column("audio", Audio())
15568 num_bytes = dataset.to_parquet(f)
15569 try:
15570 HF_API.upload_file(
15571 path_or_fileobj=f,
15572 path_in_repo=get_data_path(str(ulid.new())),
15573 repo_id=config.HF_AUDIO_REPO,
15574 repo_type=config.REPO_TYPE,
15575 token=config.HF_TOKEN,
15576 )
15577 print(f"Uploaded {num_bytes} bytes to Hugging Face")
15578 except Exception as e:
15579 print(f"Error uploading to Hugging Face: {e}")
15580 self.current_batch = self.current_batch[self.desired_batch_size:]
15581 self.desired_batch_size = get_random_batch_size(config.UPLOAD_AUDIO_BATCH_SIZE)
15582
15583
15584
15585
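AudioDatasetUploader.submit() relies on the `datasets` library's Audio feature to serialize the per-row audio dicts before the Parquet write. A standalone sketch of that cast with toy data:

```python
import numpy as np
from datasets import Dataset, Audio

rows = [{
    "audio_id": "example",
    # Audio() accepts a dict carrying a decoded waveform and its sampling rate;
    # a real file path can be supplied instead of "array"/"sampling_rate".
    "audio": {"array": np.zeros(16000, dtype=np.float32), "sampling_rate": 16000},
}]
ds = Dataset.from_list(rows).cast_column("audio", Audio(sampling_rate=16000))
```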
15586audio_dataset_uploader = AudioDatasetUploader()
15587video_dataset_uploader = DatasetUploader()
15588
15589
15590
15591---
15592File: /validator-api/validator_api/imagebind_loader.py
15593---
15594
15595from typing import Optional
15596from fastapi import HTTPException
15597import asyncio
15598import threading
15599from concurrent.futures import ThreadPoolExecutor
15600from omega.imagebind_wrapper import ImageBind
15601
15602
15603class ImageBindLoader:
15604 def __init__(self):
15605 self._imagebind: Optional[ImageBind] = None
15606 self._loading_task: Optional[asyncio.Task] = None
15607 self._lock = asyncio.Lock()
15608 self._thread_pool = ThreadPoolExecutor(max_workers=1)
15609
15610 async def get_imagebind(self) -> ImageBind:
15611 """
15612 Asynchronously get or initialize ImageBind instance.
15613 Handles concurrent requests efficiently.
15614 """
15615 if self._imagebind is not None:
15616 return self._imagebind
15617
15618 if self._loading_task is None:
15619 self._loading_task = asyncio.create_task(self._load_imagebind_wrapper())
15620
15621 raise HTTPException(
15622 status_code=503,
15623 detail="ImageBind loading has started. Please try again later."
15624 )
15625
15626 def _load_imagebind_blocking(self) -> ImageBind:
15627 """Blocking method to load ImageBind in a separate thread."""
15628 return ImageBind(v2=True)
15629
15630 async def _load_imagebind_wrapper(self) -> None:
15631 """Wrapper to run the blocking load in a thread pool."""
15632 try:
15633 # Run the blocking operation in a thread pool
15634 loop = asyncio.get_running_loop()
15635 self._imagebind = await loop.run_in_executor(
15636 self._thread_pool,
15637 self._load_imagebind_blocking
15638 )
15639 finally:
15640 self._loading_task = None
15641
15642
15643
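ImageBindLoader deliberately answers with HTTP 503 instead of blocking the event loop while the model loads, so callers are expected to poll until the model is ready. A client-side sketch of that retry contract (aiohttp is just one possible async HTTP client, and the probe endpoint shown is hypothetical):

```python
import asyncio
import aiohttp

async def wait_for_imagebind(base_url: str, attempts: int = 30) -> None:
    # Poll until the server stops answering 503, i.e. ImageBind finished loading.
    async with aiohttp.ClientSession() as session:
        for _ in range(attempts):
            async with session.get(f"{base_url}/api/count_unique") as resp:  # hypothetical probe
                if resp.status != 503:
                    return
            await asyncio.sleep(10)
        raise TimeoutError("ImageBind did not finish loading in time")
```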
15644---
15645File: /validator-api/validator_api/limiter.py
15646---
15647
15648from slowapi import Limiter
15649from slowapi.util import get_remote_address
15650
15651limiter = Limiter(key_func=get_remote_address)
15652
15653
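For the `@limiter.limit` decorators in app.py to take effect, slowapi's documented pattern also requires the limiter on app state plus an exception handler (whether app.py's truncated `main()` performs this wiring is outside this excerpt). A sketch:

```python
from fastapi import FastAPI
from slowapi import _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

from validator_api.limiter import limiter

app = FastAPI()
# slowapi looks the limiter up on app.state and needs a handler for
# RateLimitExceeded; rate-limited endpoints must also accept a `request: Request`
# parameter, which the app.py endpoints do.
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
```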
15654
15655---
15656File: /validator-api/validator_api/score.py
15657---
15658
15659import asyncio
15660import random
15661import uuid
15662 from typing import Any, BinaryIO, Dict, List, Optional, Tuple, Union
15663import math
15664
15665from pinecone import Pinecone
15666import torch
15667import torch.nn.functional as F
15668import soundfile as sf
15669from io import BytesIO
15670
15671from omega.protocol import Videos, VideoMetadata, AudioMetadata, Audios
15672from omega import video_utils, unstuff
15673from omega.constants import (
15674 MAX_VIDEO_LENGTH,
15675 MIN_VIDEO_LENGTH,
15676 DIFFERENCE_THRESHOLD,
15677 SIMILARITY_THRESHOLD,
15678 VIDEO_DOWNLOAD_TIMEOUT,
15679 MIN_SCORE,
15680 FAKE_VIDEO_PUNISHMENT,
15681 QUERY_RELEVANCE_SCALING_FACTOR,
15682 DESCRIPTION_RELEVANCE_SCALING_FACTOR,
15683 VIDEO_RELEVANCE_WEIGHT,
15684 DESCRIPTION_LENGTH_WEIGHT,
15685 MIN_LENGTH_BOOST_TOKEN_COUNT,
15686 MAX_LENGTH_BOOST_TOKEN_COUNT,
15687 STUFFED_DESCRIPTION_PUNISHMENT,
15688 DIARIZATION_SCALING_FACTOR,
15689 AUDIO_LENGTH_SCALING_FACTOR,
15690 AUDIO_QUALITY_SCALING_FACTOR,
15691 AUDIO_QUERY_RELEVANCE_SCALING_FACTOR,
15692 SPEECH_CONTENT_SCALING_FACTOR,
15693 SPEAKER_DOMINANCE_SCALING_FACTOR,
15694 BACKGROUND_NOISE_SCALING_FACTOR,
15695 MAX_AUDIO_LENGTH_SECONDS,
15696 MIN_AUDIO_LENGTH_SECONDS
15697)
15698from omega.imagebind_wrapper import ImageBind, Embeddings, run_async, LENGTH_TOKENIZER
15699from omega.text_similarity import get_text_similarity_score
15700from validator_api import config
15701from validator_api.dataset_upload import video_dataset_uploader, audio_dataset_uploader
15702from omega.audio_scoring import AudioScore
15703from omega.diarization_metric import calculate_diarization_metrics
15704
15705
15706
15707
15708PINECONE_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_INDEX)
15709PINECONE_AUDIO_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_AUDIO_INDEX)
15710GPU_SEMAPHORE = asyncio.Semaphore(1)
15711DOWNLOAD_SEMAPHORE = asyncio.Semaphore(5)
15712VIDEO_TYPE = "video"
15713AUDIO_TYPE = "audio"
15714DESCRIPTION_TYPE = "description"
15715
15716
15717async def query_pinecone(vector: List[float]) -> float:
15718 response = await run_async(
15719 PINECONE_INDEX.query,
15720 vector=vector,
15721 top_k=1,
15722 filter={
15723 "modality_type": {"$eq": VIDEO_TYPE},
15724 },
15725 )
15726 if len(response["matches"]) > 0:
15727 return 1 - response["matches"][0]["score"]
15728 else:
15729 print("No pinecone matches, returning 0")
15730 return 0
15731
15732async def get_pinecone_novelty(metadata: List[VideoMetadata]) -> List[float]:
15733 """
15734 Take the top match from the Pinecone index.
15735 """
15736 novelty_scores = await asyncio.gather(*[
15737 query_pinecone(
15738 vector=mdata.video_emb
15739 )
15740 for mdata in metadata
15741 ])
15742 return novelty_scores
15743
15744def compute_novelty_score_among_batch(emb: Embeddings) -> List[float]:
15745 video_tensor = emb.video
15746 num_videos = video_tensor.shape[0]
15747 novelty_scores = []
15748 for i in range(num_videos - 1):
15749 similarity_score = F.cosine_similarity(video_tensor[[i]], video_tensor[i + 1:]).max()
15750 novelty_scores.append(1 - similarity_score.item())
15751 novelty_scores.append(1.0) # last video is 100% novel
15752 return novelty_scores
15753
15754 async def async_zero() -> float:
15755 return 0.0
15756
15757async def compute_novelty_score(embeddings: Embeddings) -> Tuple[float, List[bool]]:
15758 local_novelty_scores = compute_novelty_score_among_batch(embeddings)
15759 global_novelty_scores = await asyncio.gather(*[
15760 async_zero() if local_score < DIFFERENCE_THRESHOLD else # don't even query Pinecone if it's already too similar
15761 query_pinecone(vector=embedding.tolist())
15762 for embedding, local_score in zip(embeddings.video, local_novelty_scores)
15763 ])
15764 true_novelty_scores = [
15765 min(local_score, global_score) for local_score, global_score
15766 in zip(local_novelty_scores, global_novelty_scores)
15767 ]
15768 is_too_similar = [score < DIFFERENCE_THRESHOLD for score in true_novelty_scores]
15769 novelty_score = sum([
15770 score for score, is_too_similar
15771 in zip(true_novelty_scores, is_too_similar)
15772 if not is_too_similar
15773 ])
15774 return novelty_score, is_too_similar
15775
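The novelty pipeline takes the minimum of a within-batch score and a Pinecone-index score, then zeroes out anything below DIFFERENCE_THRESHOLD. A worked toy example, assuming DIFFERENCE_THRESHOLD = 0.05 (the real constant lives in omega.constants):

```python
# Toy numbers, not real embeddings.
local_scores  = [0.02, 0.30, 1.00]   # 1 - max cosine sim vs later items in the batch
global_scores = [0.00, 0.10, 0.25]   # 1 - top Pinecone match (0.00: query skipped, already too similar)
true_scores   = [min(l, g) for l, g in zip(local_scores, global_scores)]   # [0.0, 0.1, 0.25]
is_too_similar = [s < 0.05 for s in true_scores]                           # [True, False, False]
novelty_score = sum(s for s, dup in zip(true_scores, is_too_similar) if not dup)  # 0.35
```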
15776
15777 def upload_to_pinecone(embeddings: Embeddings, metadata: List[VideoMetadata]) -> List[str]:
15778 video_ids = [str(uuid.uuid4()) for _ in range(len(metadata))]
15779 try:
15780 PINECONE_INDEX.upsert(
15781 vectors=sum([
15782 [
15783 {
15784 "id": f"{modality_type[:3]}{video_uuid}",
15785 "values": emb.tolist(),
15786 "metadata": {
15787 "youtube_id": video.video_id,
15788 "modality_type": modality_type,
15789 }
15790 }
15791 for emb, modality_type
15792 in zip(
15793 [embedding_vid, embedding_aud, embedding_des],
15794 [VIDEO_TYPE, AUDIO_TYPE, DESCRIPTION_TYPE]
15795 )
15796 ]
15797 for video_uuid, video, embedding_vid, embedding_aud, embedding_des
15798 in zip(video_ids, metadata, embeddings.video, embeddings.audio, embeddings.description)
15799 ], []),
15800 )
15801 except Exception as e:
15802 print(f"Failed to upload to Pinecone: {e}")
15803 return video_ids
15804
15805
15806 def upload_to_pinecone_audio(embeddings: Embeddings, metadata: List[AudioMetadata]) -> List[str]:
15807 audio_ids = [str(uuid.uuid4()) for _ in range(len(metadata))]
15808 try:
15809 PINECONE_AUDIO_INDEX.upsert(
15810 vectors=[
15811 {
15812 "id": f"{audio_uuid}",
15813 "values": embedding_aud.tolist(),
15814 "metadata": {
15815 "youtube_id": audio.video_id,
15816 }
15817 }
15818 for audio_uuid, audio, embedding_aud
15819 in zip(audio_ids, metadata, embeddings.audio)
15820 ],
15821 )
15822 except Exception as e:
15823 print(f"Failed to upload to Pinecone: {e}")
15824 return audio_ids
15825
15826async def upload_video_metadata(
15827 metadata: List[VideoMetadata],
15828 description_relevance_scores: List[float],
15829 query_relevance_scores: List[float],
15830 query: str,
15831 ) -> List[str]:
15832 # generate embeddings from our metadata
15833 embeddings = Embeddings(
15834 video=torch.stack([torch.tensor(v.video_emb) for v in metadata]),
15835 audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
15836 description=torch.stack([torch.tensor(v.description_emb) for v in metadata]),
15837 )
15838 # upload embeddings and metadata to pinecone
15839 video_ids = await run_async(upload_to_pinecone, embeddings, metadata)
15840 # Schedule upload to HuggingFace
15841 video_dataset_uploader.add_videos(
15842 metadata,
15843 video_ids,
15844 description_relevance_scores,
15845 query_relevance_scores,
15846 query,
15847 )
15848 return video_ids
15849
15850async def upload_audio_metadata(
15851 metadata: List[AudioMetadata],
15852 inverse_der: float, audio_length_score: float,
15853 audio_quality_total_score: float,
15854 audio_query_score: float,
15855 query: str,
15856 total_score: float
15857 ) -> List[str]:
15858 embeddings = Embeddings(
15859 video=None,
15860 audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
15861 description=None,
15862 )
15863 audio_ids = await run_async(upload_to_pinecone_audio, embeddings, metadata)
# (fix) removed a second `audio_ids = [str(uuid.uuid4()) ...]` that regenerated fresh UUIDs here,
# discarding the ids just returned by upload_to_pinecone_audio and de-linking dataset rows from the index
15865 audio_dataset_uploader.add_audios(
15866 metadata,
15867 audio_ids,
15868 inverse_der,
15869 audio_length_score,
15870 audio_quality_total_score,
15871 audio_query_score,
15872 query,
15873 total_score
15874 )
15875 return audio_ids
15876
15877
15878def filter_embeddings(embeddings: Embeddings, is_too_similar: List[bool]) -> Embeddings:
15879 """Filter the embeddings based on whether they are too similar to the query."""
15880 is_too_similar = torch.tensor(is_too_similar)
15881 if embeddings.video is not None:
15882 embeddings.video = embeddings.video[~is_too_similar]
15883 if embeddings.audio is not None:
15884 embeddings.audio = embeddings.audio[~is_too_similar]
15885 if embeddings.description is not None:
15886 embeddings.description = embeddings.description[~is_too_similar]
15887 return embeddings
15888
15889
15890def filter_stuffed_embeddings(embeddings: Embeddings, stuffed: List[Tuple[bool, float]]) -> Embeddings:
15891 """Filter the embeddings based on whether they are too similar to the query."""
15892 stuffed = torch.tensor([s for s, _ in stuffed])
15893 if embeddings.video is not None:
15894 embeddings.video = embeddings.video[~stuffed]
15895 if embeddings.audio is not None:
15896 embeddings.audio = embeddings.audio[~stuffed]
15897 if embeddings.description is not None:
15898 embeddings.description = embeddings.description[~stuffed]
15899 return embeddings
15900
15901def is_similar(emb_1: torch.Tensor, emb_2: List[float]) -> bool:
15902 return F.cosine_similarity(
15903 emb_1,
15904 torch.tensor(emb_2, device=emb_1.device).unsqueeze(0)
15905 ) > SIMILARITY_THRESHOLD
15906
15907
15908def strict_is_similar(emb_1: torch.Tensor, emb_2: List[float]) -> bool:
15909 return torch.allclose(emb_1, torch.tensor(emb_2, device=emb_1.device), atol=1e-4)
15910
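is_similar gates the spot-check in random_check: the validator's re-computed embedding must exceed SIMILARITY_THRESHOLD cosine similarity against what the miner submitted, while strict_is_similar demands near bit-exact agreement. A toy illustration, assuming SIMILARITY_THRESHOLD = 0.9:

```python
import torch
import torch.nn.functional as F

recomputed = F.normalize(torch.randn(1, 1024), dim=-1)                       # validator's embedding
submitted = (recomputed + 0.01 * torch.randn(1, 1024)).squeeze(0).tolist()   # miner's claimed embedding

sim = F.cosine_similarity(recomputed, torch.tensor(submitted).unsqueeze(0))
passes_loose = (sim > 0.9).item()    # typically True: tolerant of small numeric drift
passes_strict = torch.allclose(      # typically False: requires near-exact equality
    recomputed, torch.tensor(submitted), atol=1e-4
)
```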
15911
15912def metadata_check(metadata: List[VideoMetadata]) -> List[VideoMetadata]:
15913 return [
15914 video_metadata for video_metadata in metadata
15915 if (
15916 video_metadata.end_time - video_metadata.start_time <= MAX_VIDEO_LENGTH and
15917 video_metadata.end_time - video_metadata.start_time >= MIN_VIDEO_LENGTH
15918 )
15919 ]
15920
15921
15922def audio_metadata_check(metadata: List[AudioMetadata]) -> List[AudioMetadata]:
15923 return [
15924 audio_metadata for audio_metadata in metadata
15925 if (
15926 audio_metadata.end_time - audio_metadata.start_time <= MAX_VIDEO_LENGTH and
15927 audio_metadata.end_time - audio_metadata.start_time >= MIN_VIDEO_LENGTH
15928 )
15929 ]
15930
15931 def deduplicate_audios(embeddings: Embeddings) -> List[bool]:
15932 # return a list of booleans where True means the corresponding audio duplicates a later one (i.e. is_similar)
15933 audio_tensor = embeddings.audio
15934 num_audios = audio_tensor.shape[0]
15936 is_similar = []
15937 for i in range(num_audios - 1): # stop before the last row: an empty slice would make .max() raise
15938 similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
15939 has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).any()
15940 is_similar.append(has_duplicates.item())
is_similar.append(False) # the last audio has nothing after it to be compared against
15941
15942 return is_similar
15943
15944def compute_novelty_score_among_batch_audio(emb: Embeddings) -> List[float]:
15945 audio_tensor = emb.audio
15946 num_audios = audio_tensor.shape[0]
15947 novelty_scores = []
15948 for i in range(num_audios - 1):
15949 similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
15950 novelty_scores.append(1 - similarity_score.item())
15951 novelty_scores.append(1.0) # last audio is 100% novel
15952 return novelty_scores
15953
15954 def get_proxy_url() -> Optional[str]: # may return None, meaning "no proxy"
15955 return random.choice(config.PROXY_LIST + [None])
15956
15957
15958async def get_random_video(metadata: List[VideoMetadata], check_video: bool) -> Optional[Tuple[VideoMetadata, Optional[BinaryIO]]]:
15959 if not check_video:
15960 random_metadata = random.choice(metadata)
15961 return random_metadata, None
15962
15963 random_video = None
15964 metadata_copy = [v for v in metadata] # list shallow copy
15965 while random_video is None and len(metadata_copy) > 0:
15966 idx = random.randint(0, len(metadata_copy) - 1)
15967 random_metadata = metadata_copy.pop(idx)
15968 try:
15969 async with DOWNLOAD_SEMAPHORE:
15970 random_video = await asyncio.wait_for(run_async(
15971 video_utils.download_youtube_video,
15972 random_metadata.video_id,
15973 random_metadata.start_time,
15974 random_metadata.end_time,
15975 proxy=get_proxy_url(),
15976 ), timeout=VIDEO_DOWNLOAD_TIMEOUT)
15977 except video_utils.IPBlockedException:
15978 # IP is blocked, cannot download video, check description only
15979 print("WARNING: IP is blocked, cannot download video, checking description only")
15980 return random_metadata, None
15981 except video_utils.FakeVideoException:
15982 print(f"WARNING: Video {random_metadata.video_id} is fake, punishing miner")
15983 return None
15984 except asyncio.TimeoutError:
15985 continue
15986
15987 # IP is not blocked, video is not fake, but video download failed for some reason. We don't
15988 # know why it failed so we won't punish the miner, but we will check the description only.
15989 if random_video is None:
15990 return random_metadata, None
15991
15992 return random_metadata, random_video
15993
15994
15995 async def random_check(random_meta_and_vid: Tuple[VideoMetadata, Optional[BinaryIO]], imagebind: ImageBind) -> bool:
15996 random_metadata, random_video = random_meta_and_vid
15997
15998 if random_video is None:
15999 desc_embeddings = await imagebind.embed_text_async([random_metadata.description])
16000 is_similar_ = is_similar(desc_embeddings, random_metadata.description_emb)
16001 strict_is_similar_ = strict_is_similar(desc_embeddings, random_metadata.description_emb)
16002 print(f"Description similarity: {is_similar_}, strict description similarity: {strict_is_similar_}")
16003 return is_similar_
16004
16005 # Video downloaded, check all embeddings
16006 embeddings = await imagebind.embed_async([random_metadata.description], [random_video])
16007 is_similar_ = (
16008 is_similar(embeddings.video, random_metadata.video_emb) and
16009 is_similar(embeddings.audio, random_metadata.audio_emb) and
16010 is_similar(embeddings.description, random_metadata.description_emb)
16011 )
16012 strict_is_similar_ = (
16013 strict_is_similar(embeddings.video, random_metadata.video_emb) and
16014 strict_is_similar(embeddings.audio, random_metadata.audio_emb) and
16015 strict_is_similar(embeddings.description, random_metadata.description_emb)
16016 )
16017 print(f"Total similarity: {is_similar_}, strict total similarity: {strict_is_similar_}")
16018 return is_similar_
16019
16020
16021async def get_num_unique_videos(videos: Videos) -> int:
16022 metadata = videos.video_metadata
16023 embeddings = Embeddings(
16024 video=torch.stack([torch.tensor(v.video_emb) for v in metadata]),
16025 audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
16026 description=torch.stack([torch.tensor(v.description_emb) for v in metadata]),
16027 )
16028 novelty_score, is_too_similar = await compute_novelty_score(embeddings)
16029 return sum([not is_sim for is_sim in is_too_similar])
16030
16031
16032 async def _run_video_scoring(videos: Videos, imagebind: ImageBind, is_check_only: bool) -> Dict[str, Any]:
16033
16034 # check video_ids for fake videos
16035 if any(not video_utils.is_valid_youtube_id(video.video_id) for video in videos.video_metadata):
16036 return {"score": FAKE_VIDEO_PUNISHMENT}
16037
16038 metadata = metadata_check(videos.video_metadata)[:videos.num_videos]
16039 print(f"Filtered {len(videos.video_metadata)} videos down to {len(metadata)} videos")
16040
16041 # return minimum score if no videos were found in video_metadata
16042 if len(metadata) == 0:
16043 return {"score": MIN_SCORE}
16044
16045 check_video = config.CHECK_PROBABILITY > random.random()
16046 random_meta_and_vid = await get_random_video(metadata, check_video)
16047 if random_meta_and_vid is None:
16048 return {"score": FAKE_VIDEO_PUNISHMENT}
16049
16050 async with GPU_SEMAPHORE:
16051 passed_check = await random_check(random_meta_and_vid, imagebind)
16052 if not passed_check:
16053 return {"score": FAKE_VIDEO_PUNISHMENT}
16054
16055 query_emb = await imagebind.embed_text_async([videos.query])
16056
16057 # Upload the videos to Pinecone and deduplicate
16058 original_length = len(metadata)
16059 embeddings = Embeddings(
16060 video=torch.stack([torch.tensor(v.video_emb) for v in metadata]).to(imagebind.device),
16061 audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]).to(imagebind.device),
16062 description=torch.stack([torch.tensor(v.description_emb) for v in metadata]).to(imagebind.device),
16063 )
16064 novelty_score, is_too_similar = await compute_novelty_score(embeddings)
16065 embeddings = filter_embeddings(embeddings, is_too_similar)
16066 metadata = [metadata for metadata, too_similar in zip(metadata, is_too_similar) if not too_similar]
16067 print(f"Deduplicated {original_length} videos down to {len(metadata)} videos")
16068
16069 # Filter out "stuffed" descriptions.
16070 pre_filter_metadata_length = len(metadata)
16071 stuffed = [
16072 unstuff.is_stuffed(meta.description)
16073 for meta in metadata
16074 ]
16075 if any([garbage and confidence > 0.75 for garbage, confidence in stuffed]):
16076 print("Stuffed description found with high confidence, penalizing the miner.")
16077 return {"score": STUFFED_DESCRIPTION_PUNISHMENT}
16078
16079 # More stuffing.
16080 extraneous = [
16081 unstuff.check_extraneous_chunks(meta.description, meta.video_emb, meta.audio_emb, imagebind)
16082 for meta in metadata
16083 ]
16084 for really_bad, low_quality, total in extraneous:
16085 if really_bad > 5 or low_quality >= 16:
16086 print(f"Extraneous garbage found in text check {really_bad=} {low_quality=} {total=}")
16087 return {"score": STUFFED_DESCRIPTION_PUNISHMENT}
16088
16089 metadata = [
16090 metadata[idx]
16091 for idx in range(len(metadata))
16092 if not stuffed[idx][0]
16093 and extraneous[idx][1] <= 15
16094 and extraneous[idx][2] <= 50
16095 ]
16096 if len(metadata) < pre_filter_metadata_length:
16097 print(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos to remove token-stuffed descriptions.")
16098 if len(metadata) == 0:
16099 return {"score": MIN_SCORE}
16100
16101 embeddings = filter_stuffed_embeddings(embeddings, stuffed)
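# NB: metadata above was also filtered on the extraneous-chunk thresholds, but the embeddings
# are only filtered on the stuffed flag; if those thresholds drop anything, metadata and
# embeddings fall out of alignment for the relevance computations below.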
16102
16103 # Compute relevance scores
16104 video_description_relevance_scores = F.cosine_similarity(
16105 embeddings.video, embeddings.description
16106 ).tolist()
16107 audio_description_relevance_scores = F.cosine_similarity(
16108 embeddings.audio, embeddings.description
16109 ).tolist()
16110 video_query_relevance_scores = F.cosine_similarity(
16111 embeddings.video, query_emb
16112 ).tolist()
16113 audio_query_relevance_scores = F.cosine_similarity(
16114 embeddings.audio, query_emb
16115 ).tolist()
16116
16117 # Query relevance score now includes video cosim, audio cosim, and text cosim using higher quality text-only model.
16118 query_relevance_scores = [
16119 sum([
16120 video_query_relevance_scores[idx],
16121 audio_query_relevance_scores[idx],
16122 get_text_similarity_score(metadata[idx].description, videos.query),
16123 ]) / 3
16124 for idx in range(len(video_query_relevance_scores))
16125 ]
16126
16127 # Combine audio & visual description scores, weighted towards visual.
16128 description_relevance_scores = [
16129 sum([
16130 video_description_relevance_scores[idx] * VIDEO_RELEVANCE_WEIGHT,
16131 audio_description_relevance_scores[idx] * (1.0 - VIDEO_RELEVANCE_WEIGHT),
16132 ])
16133 for idx in range(len(video_description_relevance_scores))
16134 ]
16135
16136 # Scale description scores by number of unique tokens.
16137 length_scalers = []
16138 for idx in range(len(description_relevance_scores)):
16139 unique_tokens = LENGTH_TOKENIZER(metadata[idx].description)
16140 unique_tokens = set(unique_tokens[unique_tokens != 0][1:-1].tolist())
16141 unique_token_count = len(unique_tokens)
16142 if unique_token_count <= MIN_LENGTH_BOOST_TOKEN_COUNT:
16143 print(f"Very few tokens, applying {DESCRIPTION_LENGTH_WEIGHT} penalty.")
16144 description_relevance_scores[idx] *= (1.0 - DESCRIPTION_LENGTH_WEIGHT)
16145 length_scalers.append(0)
16146 continue
16147 length_scaler = min(math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2), math.log(unique_token_count, 2)) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2)
16148 length_scaler /= (math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2))
16149 length_scalers.append(length_scaler)
16150 print(f"Description length scaling factor = {length_scaler}")
16151 description_relevance_scores[idx] -= description_relevance_scores[idx] * DESCRIPTION_LENGTH_WEIGHT * (1.0 - length_scaler)
16152
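# Worked example of the length boost above (toy numbers; assumes
# MIN_LENGTH_BOOST_TOKEN_COUNT = 32 and MAX_LENGTH_BOOST_TOKEN_COUNT = 1024,
# which may differ from the actual omega.constants values):
#   unique_token_count = 256
#   length_scaler = (log2(256) - log2(32)) / (log2(1024) - log2(32))
#                 = (8 - 5) / (10 - 5) = 0.6
#   retained fraction = 1 - DESCRIPTION_LENGTH_WEIGHT * (1 - 0.6)
# i.e. longer, token-diverse descriptions lose less of their relevance score.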
16153 # Aggregate scores
16154 score = (
16155 (sum(description_relevance_scores) * DESCRIPTION_RELEVANCE_SCALING_FACTOR) +
16156 (sum(query_relevance_scores) * QUERY_RELEVANCE_SCALING_FACTOR)
16157 ) / 2 / videos.num_videos
16158
16159 print(f'''
16160 is_unique: {[not is_sim for is_sim in is_too_similar]},
16161 video cosine sim: {video_description_relevance_scores},
16162 audio cosine sim: {audio_description_relevance_scores},
16163 description relevance scores: {description_relevance_scores},
16164 query relevance scores: {query_relevance_scores},
16165 length scalers: {length_scalers},
16166 total score: {score}
16167 ''')
16168
16169 if not is_check_only and len(metadata) > 0:
16170 video_ids = await run_async(upload_to_pinecone, embeddings, metadata)
16171 # Schedule upload to HuggingFace
16172 video_dataset_uploader.add_videos(
16173 metadata,
16174 video_ids,
16175 description_relevance_scores,
16176 query_relevance_scores,
16177 videos.query,
16178 )
16179 score = max(score, MIN_SCORE)
16180
16181 if score > 0.4:
16182 print(f"Videos with score > 0.4: {metadata}")
16183
16184 return {
16185 "is_unique": [not is_sim for is_sim in is_too_similar],
16186 "description_relevance_scores": description_relevance_scores,
16187 "query_relevance_scores": query_relevance_scores,
16188 "score": score,
16189 }
16190
16191
16192 async def _run_audio_scoring(audios: Audios, imagebind: ImageBind, is_check_only: bool = False) -> Union[float, Dict[str, Any]]:
16193 """Score audio submissions and optionally upload them.
16194
16195 Args:
16196 audios: The audio submissions to score
16197 imagebind: ImageBind model for embeddings
16198 is_check_only: If True, only score without uploading
16199
16200 Returns:
16201 Either the final score (float) or a dict with detailed scoring info
16202 """
16203 if len(audios.audio_metadata) == 0:
16204 return MIN_SCORE
16205
16206 # Check for valid YouTube IDs
16207 if any(not video_utils.is_valid_youtube_id(audio.video_id) for audio in audios.audio_metadata):
16208 return FAKE_VIDEO_PUNISHMENT
16209
16210
16211 # Check audio metadata and filter out invalid ones
16212 metadata = audio_metadata_check(audios.audio_metadata)[:audios.num_audios]
16213 print(f"Filtered {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
16214
16215
16216 # execute the random check on metadata and video
16217 async with GPU_SEMAPHORE:
16218 query_emb = await imagebind.embed_text_async([audios.query])
16219
16220 embeddings = Embeddings(
16221 video=None,
16222 audio=torch.stack([torch.tensor(a.audio_emb) for a in metadata]).to(imagebind.device),
16223 description=None
16224 )
16225
16226 # check and deduplicate audios based on embedding similarity checks. We do this because we're not uploading to pinecone first.
16227 metadata_is_similar = deduplicate_audios(embeddings) # deduplicate_audios is synchronous; awaiting it raised a TypeError
16228 metadata = [metadata for metadata, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
16229 embeddings = filter_embeddings(embeddings, metadata_is_similar)
16230
16231 if len(metadata) < len(audios.audio_metadata):
16232 print(f"Deduplicated {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
16233
16234 if len(metadata) == 0:
16235 return MIN_SCORE
16236
16237 # first get local novelty scores
16238 local_novelty_scores = compute_novelty_score_among_batch_audio(embeddings)
16239 pre_filter_metadata_length = len(metadata)
16240 # check scores from index for being too similar
16241 is_too_similar = [score < DIFFERENCE_THRESHOLD for score in local_novelty_scores]
16242 # filter out metadata too similar
16243 metadata = [metadata for metadata, too_similar in zip(metadata, is_too_similar) if not too_similar]
16244 # filter out embeddings too similar
16245 embeddings = filter_embeddings(embeddings, is_too_similar)
16246 if len(metadata) < pre_filter_metadata_length:
16247 print(f"Filtering {pre_filter_metadata_length} audios down to {len(metadata)} audios that are too similar to audios in our index.")
16248
16249 # return minimum score if no unique videos were found
16250 if len(metadata) == 0:
16251 return MIN_SCORE
16252
16253 # Filter metadata based on length constraints
16254 metadata = [
16255 meta for meta in metadata # was audios.audio_metadata[:audios.num_audios], which resurrected entries already dropped by deduplication and let metadata drift out of alignment with the filtered embeddings
16256 if (meta.end_time - meta.start_time) >= MIN_AUDIO_LENGTH_SECONDS
16257 and (meta.end_time - meta.start_time) <= MAX_AUDIO_LENGTH_SECONDS
16258 ]
16259
16260 if len(metadata) == 0:
16261 return MIN_SCORE
16262
16263 total_audio_length = sum((meta.end_time - meta.start_time) for meta in metadata)
16264 print(f"Average audio length: {total_audio_length/len(metadata):.2f} seconds")
16265 audio_length_score = total_audio_length/(audios.num_audios*MAX_AUDIO_LENGTH_SECONDS)
16266
16267 audio_query_score = sum(F.cosine_similarity(
16268 embeddings.audio, query_emb
16269 ).tolist())/len(metadata)
16270 print(f"Audio query score: {audio_query_score}")
16271
16272 # Randomly sample one audio for duration check
16273 selected_random_meta = random.choice(metadata)
16274 audio_array, sr = sf.read(BytesIO(selected_random_meta.audio_bytes))
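# NB: dataset_upload.py treats AudioMetadata.audio_bytes as base64 text and decodes it
# before sf.read; if that is the wire format, this call needs base64.b64decode as well.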
16275 audio_duration = len(audio_array) / sr
16276 print(f"Selected Youtube Video: {selected_random_meta.video_id}, Duration: {audio_duration:.2f} seconds")
16277
16278 audio_quality_scores = AudioScore().total_score(
16279 audio_array,
16280 sr,
16281 selected_random_meta.diar_timestamps_start,
16282 selected_random_meta.diar_timestamps_end,
16283 selected_random_meta.diar_speakers
16284 )
16285 audio_quality_total_score = (
16286 audio_quality_scores["speech_content_score"] * SPEECH_CONTENT_SCALING_FACTOR +
16287 audio_quality_scores["speaker_dominance_score"] * SPEAKER_DOMINANCE_SCALING_FACTOR +
16288 audio_quality_scores["background_noise_score"] * BACKGROUND_NOISE_SCALING_FACTOR
16289 )
16290
16291 miner_diar_segment = {
16292 "start": selected_random_meta.diar_timestamps_start,
16293 "end": selected_random_meta.diar_timestamps_end,
16294 "speakers": selected_random_meta.diar_speakers
16295 }
16296
16297 diarization_score = calculate_diarization_metrics(
16298 audio_array,
16299 sr,
16300 miner_diar_segment
16301 )
16302 inverse_der = diarization_score["inverse_der"]
16303 total_score = (
16304 DIARIZATION_SCALING_FACTOR * inverse_der +
16305 AUDIO_LENGTH_SCALING_FACTOR * audio_length_score +
16306 AUDIO_QUALITY_SCALING_FACTOR * audio_quality_total_score +
16307 AUDIO_QUERY_RELEVANCE_SCALING_FACTOR * audio_query_score
16308 )
16309
16310 print(f'''
16311 is_unique: {[not is_sim for is_sim in is_too_similar]},
16312 audio_query_score: {audio_query_score},
16313 audio_length_score: {audio_length_score},
16314 audio_quality_score: {audio_quality_total_score},
16315 diarization_score: {inverse_der},
16316 total score: {total_score}
16317 ''')
16318
16319 if not is_check_only and len(metadata) > 0:
16320 # Upload metadata and schedule dataset upload
16321 audio_ids = await run_async(upload_to_pinecone_audio, embeddings, metadata)
16322
16323 audio_dataset_uploader.add_audios(
16324 metadata,
16325 audio_ids,
16326 inverse_der,
16327 audio_length_score,
16328 audio_quality_total_score,
16329 audio_query_score,
16330 audios.query,
16331 total_score,
16332 )
16333 total_score = max(total_score, MIN_SCORE)
16334
16335 if total_score > 0.4:
16336 print(f"Audios with score > 0.4: {metadata}")
16337
16338 return {
16339 "is_unique": [not is_sim for is_sim in is_too_similar],
16340 "audio_query_score": audio_query_score,
16341 "audio_length_score": audio_length_score,
16342 "audio_quality_score": audio_quality_total_score,
16343 "diarization_score": inverse_der,
16344 "score": total_score
16345 }
16346
16347
16348async def score_videos_for_testing(videos: Videos, imagebind: ImageBind) -> float:
16349 return await _run_video_scoring(videos, imagebind, is_check_only=True)
16350
16351
16352async def score_and_upload_videos(videos: Videos, imagebind: ImageBind) -> float:
16353 scores_dict = await _run_video_scoring(videos, imagebind, is_check_only=False)
16354 return scores_dict["score"]
16355
16356
16357async def score_audios_for_testing(audios: Audios, imagebind: ImageBind) -> float:
16358 return await _run_audio_scoring(audios, imagebind, is_check_only=True)
16359
16360
16361 async def score_and_upload_audios(audios: Audios, imagebind: ImageBind) -> float:
16362 result = await _run_audio_scoring(audios, imagebind, is_check_only=False)
16363 return result["score"] if isinstance(result, dict) else result # early exits return a bare float
16364
16365
16366---
16367File: /validator-api/_generate_api_key.py
16368---
16369
16370import secrets
16371
16372def generate_api_key():
16373 return secrets.token_urlsafe(32) # Generates a 32-byte (256-bit) key
16374
16375new_api_key = generate_api_key()
16376print(new_api_key)
16377
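generate_api_key() yields 256 bits of entropy rendered as a roughly 43-character URL-safe string. When validating keys server-side, a constant-time comparison is the safer pattern; a sketch (the helper name is hypothetical):

```python
import secrets
from typing import List

def api_key_is_valid(candidate: str, valid_keys: List[str]) -> bool:
    # Unlike the plain `in` membership test used by get_api_key in app.py,
    # secrets.compare_digest compares in constant time, which avoids leaking
    # key prefixes through response timing.
    return any(secrets.compare_digest(candidate, key) for key in valid_keys)
```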
16378
16379---
16380File: /validator-api/app.py
16381---
16382
16383import asyncio
16384import requests
16385import os
16386import json
16387from datetime import datetime
16388import time
16389from typing import Annotated, List, Optional, Dict, Any
16390import random
16392from pydantic import BaseModel
16393import traceback
16394from threading import Lock
16395
16396from tempfile import TemporaryDirectory
16397import huggingface_hub
16398from datasets import load_dataset
16399import ulid
16400
16401from traceback import print_exception
16402
16403import bittensor
16404import uvicorn
16405from fastapi import FastAPI, HTTPException, Depends, Body, Path, Security, BackgroundTasks, Request
16406from fastapi.security import HTTPBasicCredentials, HTTPBasic
16407from fastapi.security.api_key import APIKeyHeader
16408from fastapi.staticfiles import StaticFiles
16409from fastapi.responses import FileResponse
16410from starlette import status
16411from substrateinterface import Keypair
16412
16413import sentry_sdk
16414
16415from sqlalchemy.orm import Session
16416from validator_api.database import get_db, get_db_context
16417from validator_api.database.crud.focusvideo import (
16418 get_all_available_focus, check_availability, get_video_owner_coldkey,
16419 already_purchased_max_focus_tao, get_miner_purchase_stats, MinerPurchaseStats,
16420 set_focus_video_score, mark_video_rejected, mark_video_submitted, TaskType
16421)
16422from validator_api.utils.marketplace import get_max_focus_tao, TASK_TYPE_MAP, get_purchase_max_focus_tao
16423from validator_api.cron.confirm_purchase import confirm_transfer, confirm_video_purchased
16424from validator_api.services.scoring_service import FocusScoringService, VideoUniquenessError
16425
16426from validator_api.communex.client import CommuneClient
16427from validator_api.communex._common import get_node_url
16428
16429from omega.protocol import Videos, VideoMetadata, AudioMetadata
16430from validator_api.imagebind_loader import ImageBindLoader
16431
16432from validator_api.config import (
16433 NETWORK, NETUID,
16434 ENABLE_COMMUNE, COMMUNE_NETWORK, COMMUNE_NETUID,
16435 API_KEY_NAME, API_KEYS, DB_CONFIG,
16436 TOPICS_LIST, PROXY_LIST, IS_PROD,
16437 FOCUS_REWARDS_PERCENT, FOCUS_API_KEYS,
16438 SENTRY_DSN, IMPORT_SCORE
16439)
16440
16441print("IMPORT_SCORE:", IMPORT_SCORE)
16442
16443if IMPORT_SCORE is not False:
16444 from validator_api import score
16445else:
16446 # remove cuda error on mac
16447 score = None
16448
16449from validator_api.dataset_upload import video_dataset_uploader, audio_dataset_uploader
16450from validator_api.limiter import limiter
16451
16452
16453### Constants for OMEGA Metadata Dashboard ###
16454HF_DATASET = "omegalabsinc/omega-multimodal"
16455DATA_FILES_PREFIX = "default/train/"
16456MAX_FILES = 1
16457CACHE_FILE = "desc_embeddings_recent.json"
16458 MIN_AGE = 60 * 60 * 48 # 2 days in seconds; despite the name, used as a maximum-age cutoff for "recent" files
16459
16460import mysql.connector
16461
16462def connect_to_db():
16463 try:
16464 connection = mysql.connector.connect(**DB_CONFIG)
16465 return connection
16466 except mysql.connector.Error as err:
16467 print("Error in connect_to_db while creating MySQL database connection:", err)
16468
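connect_to_db logs and implicitly returns None on failure, which later surfaces as an AttributeError at cursor(). A context-manager wrapper that fails loudly and always closes the handle is one alternative shape; a sketch:

```python
from contextlib import contextmanager
import mysql.connector

@contextmanager
def db_connection(db_config: dict):
    # Raises on connection failure instead of returning None, and
    # guarantees the connection is closed even if the caller errors.
    connection = mysql.connector.connect(**db_config)
    try:
        yield connection
    finally:
        connection.close()
```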
16469# define the APIKeyHeader for API authorization to our multi-modal endpoints
16470api_key_header = APIKeyHeader(name=API_KEY_NAME, auto_error=False)
16471focus_api_key_header = APIKeyHeader(name="FOCUS_API_KEY", auto_error=False)
16472
16473security = HTTPBasic()
16474imagebind_loader = ImageBindLoader()
16475
16476focus_scoring_service = FocusScoringService()
16477
16478print("SENTRY_DSN:", SENTRY_DSN)
16479sentry_sdk.init(
16480 dsn=SENTRY_DSN,
16481 traces_sample_rate=1.0,
16482 profiles_sample_rate=1.0,
16483)
16484
16485# region Utility functions for OMEGA Metadata Dashboard
16486def get_timestamp_from_filename(filename: str):
16487 return ulid.from_str(os.path.splitext(filename.split("/")[-1])[0]).timestamp().timestamp
16488
16489 def pull_and_cache_dataset() -> bool:
16490 # Get the list of files in the dataset repository
16491 omega_ds_files = huggingface_hub.repo_info(repo_id=HF_DATASET, repo_type="dataset").siblings
16492
16493 # Filter files that match the DATA_FILES_PREFIX
16494 recent_files = [
16495 f.rfilename
16496 for f in omega_ds_files if
16497 f.rfilename.startswith(DATA_FILES_PREFIX) and
16498 time.time() - get_timestamp_from_filename(f.rfilename) < MIN_AGE
16499 ] # note: the original sliced [:MAX_FILES] here, which truncated the pool before random.sample below and made the "random" sampling always pick the first files
16500
16501 # Randomly sample up to MAX_FILES from the matching files
16502 sampled_files = random.sample(recent_files, min(MAX_FILES, len(recent_files)))
16503
16504 # Load the dataset using the sampled files
16505 video_metadata = []
16506 with TemporaryDirectory() as temp_dir:
16507 omega_dataset = load_dataset(HF_DATASET, data_files=sampled_files, cache_dir=temp_dir)["train"]
16508 for i, entry in enumerate(omega_dataset):
16509 metadata = []
16510 if "description" in entry and "description_embed" in entry:
16511 metadata.append(entry["video_id"])
16512 metadata.append(entry["youtube_id"])
16513 metadata.append(entry["start_time"])
16514 metadata.append(entry["end_time"])
16515 metadata.append(entry["description"])
16516 metadata.append(entry["description_relevance_score"])
16517 metadata.append(entry["query_relevance_score"])
16518 metadata.append(entry["query"])
16519 metadata.append(entry["submitted_at"])
16520 video_metadata.append(metadata)
16521
16522 # Cache the descriptions to a local file
16523 with open(CACHE_FILE, "w") as f:
16524 json.dump(video_metadata, f)
16525
16526 return True
16527# endregion Utility functions for OMEGA Metadata Dashboard
16528
16529async def get_api_key(api_key_header: str = Security(api_key_header)):
16530 if api_key_header in API_KEYS:
16531 return api_key_header
16532 else:
16533 raise HTTPException(
16534 status_code=401,
16535 detail="Invalid API Key"
16536 )
16537
16538async def get_focus_api_key(focus_api_key_header: str = Security(focus_api_key_header)):
16539 if focus_api_key_header in FOCUS_API_KEYS:
16540 return focus_api_key_header
16541 else:
16542 raise HTTPException(
16543 status_code=401,
16544 detail="Invalid API Key"
16545 )
16546
16547class VideoMetadataUpload(BaseModel):
16548 metadata: List[VideoMetadata]
16549 description_relevance_scores: List[float]
16550 query_relevance_scores: List[float]
16551 topic_query: str
16552 novelty_score: Optional[float] = None
16553 total_score: Optional[float] = None
16554 miner_hotkey: Optional[str] = None
16555
16556class AudioMetadataUpload(BaseModel):
16557 metadata: List[AudioMetadata]
16558 inverse_der: float
16559 audio_length_score: float
16560 audio_quality_total_score: float
16561 audio_query_score: float
16562 topic_query: str
16563 total_score: Optional[float] = None
16564 miner_hotkey: Optional[str] = None
16565
16566class FocusScoreResponse(BaseModel):
16567 video_id: str
16568 video_score: float
16569 video_details: dict
16570
16571class VideoPurchaseRevert(BaseModel):
16572 video_id: str
16573
16574def get_hotkey(credentials: Annotated[HTTPBasicCredentials, Depends(security)]) -> str:
16575 keypair = Keypair(ss58_address=credentials.username)
16576
16577 if keypair.verify(credentials.username, credentials.password):
16578 return credentials.username
16579
16580 raise HTTPException(
16581 status_code=status.HTTP_401_UNAUTHORIZED,
16582 detail="Signature mismatch",
16583 )
16584
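For reference, get_hotkey's contract means a validator client builds its HTTP Basic credentials with the ss58 hotkey as the username and a signature over that same string as the password. A sketch with substrateinterface (the mnemonic is a stand-in; verify() expects hex signatures to carry a 0x prefix):

```python
from substrateinterface import Keypair

hotkey = Keypair.create_from_mnemonic(Keypair.generate_mnemonic())  # stand-in for a real validator hotkey
username = hotkey.ss58_address
password = f"0x{hotkey.sign(username).hex()}"  # signature over the address string
# Sent as HTTP Basic auth. Server side: Keypair(ss58_address=username).verify(username, password)
```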
16585def check_commune_validator_hotkey(hotkey: str, modules_keys):
16586 if hotkey not in modules_keys.values():
16587 print("Commune validator key not found")
16588 return False
16589 return True
16590
16591def authenticate_with_bittensor(hotkey, metagraph):
16592 if hotkey not in metagraph.hotkeys:
16593 return False
16594
16595 uid = metagraph.hotkeys.index(hotkey)
16596 if not metagraph.validator_permit[uid] and NETWORK != "test":
16597 print("Bittensor validator permit required")
16598 return False
16599
16600 if metagraph.S[uid] < 1000 and NETWORK != "test":
16601 print("Bittensor validator requires 1000+ staked TAO")
16602 return False
16603
16604 return True
16605
16606def authenticate_with_commune(hotkey, commune_keys):
16607 if ENABLE_COMMUNE and not check_commune_validator_hotkey(hotkey, commune_keys):
16608 return False
16609 return True
16610
16611def update_commune_keys(commune_client, commune_keys):
16612 try:
16613 return commune_client.query_map_key(COMMUNE_NETUID)
16614 except Exception as err:
16615 print("Error during commune keys update", str(err))
16616 return commune_keys
16617
16618async def run_focus_scoring(
16619 video_id: Annotated[str, Body()],
16620 focusing_task: Annotated[str, Body()],
16621 focusing_description: Annotated[str, Body()]
16622) -> Dict[str, Any]:
16623
16624 score_details = None
16625 embeddings = None
16626 try:
16627 score_details, embeddings = await focus_scoring_service.score_video(video_id, focusing_task, focusing_description)
16628 print(f"Score for focus video <{video_id}>: {score_details.final_score}")
16629 MIN_FINAL_SCORE = 0.1
16630 # todo: measure and tune these
16631 MIN_TASK_UNIQUENESS_SCORE = 0
16632 MIN_VIDEO_UNIQUENESS_SCORE = 0
16633 # get the db after scoring the video so it's not open for too long
16634 with get_db_context() as db:
16635 if score_details.final_score < MIN_FINAL_SCORE:
16636 rejection_reason = f"""This video got a score of {score_details.final_score * 100:.2f}%, which is lower than the minimum score of {MIN_FINAL_SCORE * 100}%.
16637Feedback from AI: {score_details.completion_score_breakdown.rationale}"""
16638 mark_video_rejected(
16639 db,
16640 video_id,
16641 rejection_reason,
16642 score_details=score_details,
16643 embeddings=embeddings
16644 )
16645 else:
16646 set_focus_video_score(db, video_id, score_details, embeddings)
16647 return { "success": True }
16648
16649 except Exception as e:
16650 exception_string = traceback.format_exc()
16651 error_string = f"{str(e)}\n{exception_string}"
16652 print(f"Error scoring focus video <{video_id}>: {error_string}")
16653 with get_db_context() as db:
16654 mark_video_rejected(
16655 db,
16656 video_id,
16657 "Task recording is not unique. If you believe this is an error, please contact a team member." if isinstance(e, VideoUniquenessError) else "Error scoring video",
16658 score_details=score_details,
16659 embeddings=embeddings,
16660 exception_string=exception_string,
16661 )
16662 return { "success": False, "error": error_string }
16663
16664async def main():
16665 app = FastAPI()
16666 # Mount the static directory to serve static files
16667 app.mount("/static", StaticFiles(directory="validator-api/static"), name="static")
16668
16669 subtensor = bittensor.subtensor(network=NETWORK)
16670 metagraph: bittensor.metagraph = subtensor.metagraph(NETUID)
16671
16672 commune_client = None
16673 commune_keys = None
16674 if ENABLE_COMMUNE:
16675 commune_client = CommuneClient(get_node_url(use_testnet=True if COMMUNE_NETWORK == "test" else False))
16676 commune_keys = update_commune_keys(commune_client, commune_keys)
16677
16678 async def resync_metagraph():
nonlocal commune_keys # required: the assignment below otherwise rebinds commune_keys as a local, so its first read raises UnboundLocalError
16679 while True:
16680 """Resyncs the metagraph and updates the hotkeys and moving averages based on the new metagraph."""
16681 print("resync_metagraph()")
16682
16683 try:
16684 # Sync the metagraph.
16685 print("syncing metagraph")
16686 metagraph.sync(subtensor=subtensor)
16687 print("metagraph synced")
16688
16689 # Sync latest commune keys
16690 if ENABLE_COMMUNE:
16691 commune_keys = update_commune_keys(commune_client, commune_keys)
16692 print("commune keys synced")
16693
16694 # In case of unforeseen errors, the api will log the error and continue operations.
16695 except Exception as err:
16696 print("Error during metagraph sync", str(err))
16697 print_exception(type(err), err, err.__traceback__)
16698
16699 await asyncio.sleep(90)
16700
16701 @app.on_event("shutdown")
16702 async def shutdown_event():
16703 print("Shutdown event fired, attempting dataset upload of current batch.")
16704 video_dataset_uploader.submit()
16705 audio_dataset_uploader.submit()
16706
16707 @app.get("/sentry-debug")
16708 async def trigger_error():
16709 division_by_zero = 1 / 0
16710
16711 @app.post("/api/get_pinecone_novelty")
16712 async def get_pinecone_novelty(
16713 metadata: List[VideoMetadata],
16714 hotkey: Annotated[str, Depends(get_hotkey)],
16715 ) -> List[float]:
16716 print("get_pinecone_novelty()")
16717
16718 if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16719 raise HTTPException(
16720 status_code=status.HTTP_403_FORBIDDEN,
16721 detail=f"Valid hotkey required.",
16722 )
16723
16724 uid = None
16725 if ENABLE_COMMUNE and hotkey in commune_keys.values():
16726 # get uid of commune validator
16727 for key_uid, key_hotkey in commune_keys.items():
16728 if key_hotkey == hotkey:
16729 uid = key_uid
16730 break
16731 validator_chain = "commune"
16732 elif uid is None and hotkey in metagraph.hotkeys:
16733 # get uid of bittensor validator
16734 uid = metagraph.hotkeys.index(hotkey)
16735 validator_chain = "bittensor"
16736
16737 start_time = time.time()
16738 # query the pinecone index to get novelty scores
16739 novelty_scores = await score.get_pinecone_novelty(metadata)
16740 print(f"Returning novelty scores={novelty_scores} for {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16741 return novelty_scores
16742
16743 @app.post("/api/upload_video_metadata")
16744 async def upload_video_metadata(
16745 upload_data: VideoMetadataUpload,
16746 hotkey: Annotated[str, Depends(get_hotkey)],
16747 ) -> bool:
16748 print("upload_video_metadata()")
16749 if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16750 raise HTTPException(
16751 status_code=status.HTTP_403_FORBIDDEN,
16752 detail=f"Valid hotkey required.",
16753 )
16754
16755 uid = None
16756 is_bittensor = 0
16757 is_commune = 0
16758 if ENABLE_COMMUNE and hotkey in commune_keys.values():
16759 # get uid of commune validator
16760 for key_uid, key_hotkey in commune_keys.items():
16761 if key_hotkey == hotkey:
16762 uid = key_uid
16763 break
16764 validator_chain = "commune"
16765 is_commune = 1
16766 elif uid is None and hotkey in metagraph.hotkeys:
16767 # get uid of bittensor validator
16768 uid = metagraph.hotkeys.index(hotkey)
16769 validator_chain = "bittensor"
16770 is_bittensor = 1
16771
16772 metadata = upload_data.metadata
16773 description_relevance_scores = upload_data.description_relevance_scores
16774 query_relevance_scores = upload_data.query_relevance_scores
16775 topic_query = upload_data.topic_query
16776
16777 start_time = time.time()
16778 video_ids = await score.upload_video_metadata(metadata, description_relevance_scores, query_relevance_scores, topic_query)
16779 print(f"Uploaded {len(video_ids)} video metadata from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16780
16781 if upload_data.miner_hotkey is not None:
16782 # Calculate and upsert leaderboard data
16783 datapoints = len(video_ids)
16784 avg_desc_relevance = sum(description_relevance_scores) / len(description_relevance_scores)
16785 avg_query_relevance = sum(query_relevance_scores) / len(query_relevance_scores)
16786 novelty_score = upload_data.novelty_score
16787 total_score = upload_data.total_score
16788 miner_hotkey = upload_data.miner_hotkey
16789
16790 try:
16791 start_time = time.time()
16792 connection = connect_to_db()
16793
16794 leaderboard_table_name = "miner_leaderboard"
16795 if not IS_PROD:
16796 leaderboard_table_name += "_test"
16797 query = f"""
16798 INSERT INTO {leaderboard_table_name} (
16799 hotkey,
16800 is_bittensor,
16801 is_commune,
16802 datapoints,
16803 avg_desc_relevance,
16804 avg_query_relevance,
16805 avg_novelty,
16806 avg_score,
16807 last_updated
16808 ) VALUES (
16809 %s, %s, %s, %s, %s, %s, %s, %s, NOW()
16810 ) ON DUPLICATE KEY UPDATE
16811 datapoints = datapoints + VALUES(datapoints),
16812 avg_desc_relevance = ((avg_desc_relevance * (datapoints - VALUES(datapoints))) + (VALUES(avg_desc_relevance) * VALUES(datapoints))) / datapoints,
16813 avg_query_relevance = ((avg_query_relevance * (datapoints - VALUES(datapoints))) + (VALUES(avg_query_relevance) * VALUES(datapoints))) / datapoints,
16814 avg_novelty = ((avg_novelty * (datapoints - VALUES(datapoints))) + (VALUES(avg_novelty) * VALUES(datapoints))) / datapoints,
16815 avg_score = ((avg_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_score) * VALUES(datapoints))) / datapoints,
16816 last_updated = NOW();
16817 """
16818 cursor = connection.cursor()
16819 cursor.execute(query, (
16820 miner_hotkey,
16821 is_bittensor,
16822 is_commune,
16823 datapoints,
16824 avg_desc_relevance,
16825 avg_query_relevance,
16826 novelty_score,
16827 total_score
16828 ))
16829 connection.commit()
16830 print(f"Upserted leaderboard data for {miner_hotkey} from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16831
16832 except mysql.connector.Error as err:
16833 raise HTTPException(status_code=500, detail=f"Error upserting leaderboard data into MySQL database: {err}")
16834 finally:
16835 if connection:
16836 connection.close()
16837 else:
16838 print("Skipping leaderboard update because either non-production environment or vali running outdated code.")
16839
16840 return True
16841
16842
16843 @app.post("/api/upload_audio_metadata")
16844 async def upload_audio_metadata(
16845 upload_data: AudioMetadataUpload,
16846 hotkey: Annotated[str, Depends(get_hotkey)],
16847 ) -> bool:
16848 print("upload_audio_metadata()")
16849
16850 if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16851 raise HTTPException(
16852 status_code=status.HTTP_403_FORBIDDEN,
16853 detail=f"Valid hotkey required.",
16854 )
16855
16856 uid = None
16857 is_bittensor = 0
16858 is_commune = 0
16859 if ENABLE_COMMUNE and hotkey in commune_keys.values():
16860 # get uid of commune validator
16861 for key_uid, key_hotkey in commune_keys.items():
16862 if key_hotkey == hotkey:
16863 uid = key_uid
16864 break
16865 validator_chain = "commune"
16866 is_commune = 1
16867 elif uid is None and hotkey in metagraph.hotkeys:
16868 # get uid of bittensor validator
16869 uid = metagraph.hotkeys.index(hotkey)
16870 validator_chain = "bittensor"
16871 is_bittensor = 1
16872
16873 metadata = upload_data.metadata
16874 inverse_der = upload_data.inverse_der
16875 audio_length_score = upload_data.audio_length_score
16876 audio_quality_total_score = upload_data.audio_quality_total_score
16877 audio_query_score = upload_data.audio_query_score
16878 topic_query = upload_data.topic_query
16879 total_score = upload_data.total_score
16880
16881 start_time = time.time()
16882 audio_ids = await score.upload_audio_metadata(metadata, inverse_der, audio_length_score, audio_quality_total_score, audio_query_score, topic_query, total_score)
16883 print(f"Uploaded {len(audio_ids)} audio metadata from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16884
16885 if upload_data.miner_hotkey is not None:
16886 # Calculate and upsert leaderboard data
16887 datapoints = len(audio_ids)
16888 total_score = upload_data.total_score
16889 miner_hotkey = upload_data.miner_hotkey
16890
16891 try:
16892 start_time = time.time()
16893 connection = connect_to_db()
16894
16895 leaderboard_table_name = "miner_leaderboard_audio"
16896 if not IS_PROD:
16897 leaderboard_table_name += "_test"
16898 query = f"""
16899 INSERT INTO {leaderboard_table_name} (
16900 hotkey,
16901 is_bittensor,
16902 is_commune,
16903 datapoints,
16904 avg_der,
16905 avg_length_score,
16906 avg_quality_score,
16907 avg_query_score,
16908 avg_score,
16909 last_updated
16910 ) VALUES (
16911 %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW()
16912 ) ON DUPLICATE KEY UPDATE
16913 datapoints = datapoints + VALUES(datapoints),
16914 avg_der = ((avg_der * (datapoints - VALUES(datapoints))) + (VALUES(avg_der) * VALUES(datapoints))) / datapoints,
16915 avg_length_score = ((avg_length_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_length_score) * VALUES(datapoints))) / datapoints,
16916 avg_quality_score = ((avg_quality_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_quality_score) * VALUES(datapoints))) / datapoints,
16917 avg_query_score = ((avg_query_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_query_score) * VALUES(datapoints))) / datapoints,
16918 avg_score = ((avg_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_score) * VALUES(datapoints))) / datapoints,
16919 last_updated = NOW();
16920 """
16921 cursor = connection.cursor()
16922 cursor.execute(query, (
16923 miner_hotkey,
16924 is_bittensor,
16925 is_commune,
16926 datapoints,
16927 inverse_der,
16928 audio_length_score,
16929 audio_quality_total_score,
16930 audio_query_score,
16931 total_score
16932 ))
16933 connection.commit()
16934 print(f"Upserted leaderboard data for {miner_hotkey} from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16935
16936 except mysql.connector.Error as err:
16937 raise HTTPException(status_code=500, detail=f"Error upserting leaderboard data into MySQL database: {err}")
16938 finally:
16939 if connection:
16940 connection.close()
16941 else:
16942 print("Skipping leaderboard update because either non-production environment or vali running outdated code.")
16943
16944 return True
16945
16946 @app.post("/api/get_proxy")
16947 async def get_proxy(
16948 hotkey: Annotated[str, Depends(get_hotkey)]
16949 ) -> str:
16950
16951 if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16952 raise HTTPException(
16953 status_code=status.HTTP_403_FORBIDDEN,
16954 detail=f"Valid hotkey required.",
16955 )
16956
16957 return random.choice(PROXY_LIST)
16958
16959 ################ START OMEGA FOCUS ENDPOINTS ################
16960 @app.post("/api/focus/get_focus_score")
16961 async def get_focus_score(
16962 api_key: str = Security(get_focus_api_key),
16963 video_id: Annotated[str, Body()] = None,
16964 focusing_task: Annotated[str, Body()] = None,
16965 focusing_description: Annotated[str, Body()] = None,
16966 background_tasks: BackgroundTasks = BackgroundTasks(),
16967 ) -> Dict[str, bool]:
16968 background_tasks.add_task(run_focus_scoring, video_id, focusing_task, focusing_description)
16969 return { "success": True }
16970
16971 @app.get("/api/focus/get_list")
16972 @limiter.limit("1000/minute")
16973 async def _get_available_focus_video_list(
16974 request: Request,
16975 db: Session=Depends(get_db)
16976 ):
16977 """
16978 Return all available focus videos
16979 """
16980 return await get_all_available_focus(db)
16981
16982 # FV TODO: let's do proper miner auth here instead, and then from the retrieved hotkey, we can also
16983 # retrieve the coldkey and use that to confirm the transfer
16984 @app.post("/api/focus/purchase")
16985 @limiter.limit("20/minute")
16986 async def purchase_video(
16987 request: Request,
16988 background_tasks: BackgroundTasks,
16989 video_id: Annotated[str, Body()],
16990 miner_hotkey: Annotated[str, Body()],
16991 ):
16992 if await already_purchased_max_focus_tao():
16993 print("Purchases in the last 24 hours have reached the max focus tao limit.")
16994 raise HTTPException(400, "Purchases in the last 24 hours have reached the max focus tao limit, please try again later.")
16995
16996 with get_db_context() as db:
16997 availability = await check_availability(db, video_id, miner_hotkey, True) # run with_lock True
16998 print('availability', availability)
16999 if availability['status'] == 'success':
17000 amount = availability['price']
17001 video_owner_coldkey = get_video_owner_coldkey(db, video_id) # run with_lock True
17002 background_tasks.add_task(confirm_video_purchased, video_id, True) # run with_lock True
17003 return {
17004 'status': 'success',
17005 'address': video_owner_coldkey,
17006 'amount': amount,
17007 }
17008 else:
17009 return availability
17010
17011 @app.post("/api/focus/revert-pending-purchase")
17012 @limiter.limit("100/minute")
17013 async def revert_pending_purchase(
17014 request: Request,
17015 video: VideoPurchaseRevert,
17016 db: Session=Depends(get_db),
17017 ):
17018 return mark_video_submitted(db, video.video_id, True) # run with_lock True
17019
17020 @app.post("/api/focus/verify-purchase")
17021 @limiter.limit("100/minute")
17022 async def verify_purchase(
17023 request: Request,
17024 miner_hotkey: Annotated[str, Body()],
17025 video_id: Annotated[str, Body()],
17026 block_hash: Annotated[str, Body()],
17027 db: Session=Depends(get_db),
17028 ):
17029 video_owner_coldkey = get_video_owner_coldkey(db, video_id) # run with_lock True
17030 result = await confirm_transfer(db, video_owner_coldkey, video_id, miner_hotkey, block_hash)
17031 if result:
17032 return {
17033 'status': 'success',
17034 'message': 'Video purchase verification was successful'
17035 }
17036 else:
17037 return {
17038 'status': 'error',
17039 'message': f'Video purchase verification failed for video_id {video_id} on block_hash {block_hash} by miner_hotkey {miner_hotkey}'
17040 }
17041
17042 @app.get('/api/focus/miner_purchase_scores/{miner_hotkey_list}')
17043 async def miner_purchase_scores(
17044 miner_hotkey_list: str,
17045 db: Session = Depends(get_db)
17046 ) -> Dict[str, MinerPurchaseStats]:
17047 return {
17048 hotkey: await get_miner_purchase_stats(db, hotkey)
17049 for hotkey in miner_hotkey_list.split(',')
17050 }
17051
17052 class TaskTypeMap(BaseModel):
17053 task_type_map: Dict[TaskType, float]
17054
17055 @app.get('/api/focus/get_task_percentage_map')
17056 def get_task_percentage_map():
17057 return TaskTypeMap(task_type_map=TASK_TYPE_MAP)
17058
17059 @app.get('/api/focus/get_rewards_percent')
17060 async def get_rewards_percent():
17061 return FOCUS_REWARDS_PERCENT
17062
17063 @app.get('/api/focus/get_max_focus_tao')
17064 async def _get_max_focus_tao() -> float:
17065 return await get_max_focus_tao()
17066
17067 @app.get('/api/focus/get_purchase_max_focus_tao')
17068 async def _get_purchase_max_focus_tao() -> float:
17069 return await get_purchase_max_focus_tao()
17070
17071 async def cache_max_focus_tao():
17072     """Re-caches the value of max_focus_tao every 30 minutes."""
17073     while True:
17074         print("cache_max_focus_tao()")
17075
17076 max_attempts = 3
17077 attempt = 0
17078
17079 while attempt < max_attempts:
17080 try:
17081 max_focus_tao = await get_max_focus_tao()
17082 break # Exit the loop if the function succeeds
17083
17084 # In case of unforeseen errors, the api will log the error and continue operations.
17085 except Exception as err:
17086 attempt += 1
17087 print(f"Error during recaching of max_focus_tao (Attempt {attempt}/{max_attempts}):", str(err))
17088
17089 if attempt >= max_attempts:
17090                 print("Max attempts reached. Skipping caching for this cycle.")
17091 break
17092
17093 # Sleep in seconds
17094 await asyncio.sleep(1800) # 30 minutes
17095 ################ END OMEGA FOCUS ENDPOINTS ################
17096
17097 """ TO BE DEPRECATED """
17098 @app.post("/api/validate")
17099 async def validate(
17100 videos: Videos,
17101 hotkey: Annotated[str, Depends(get_hotkey)],
17102 ) -> Optional[float]:
17103 if not authenticate_with_bittensor(hotkey, metagraph):
17104 raise HTTPException(
17105 status_code=status.HTTP_403_FORBIDDEN,
17106         detail="Valid hotkey required.",
17107 )
17108 uid = metagraph.hotkeys.index(hotkey)
17109
17110 start_time = time.time()
17111
17112 youtube_rewards = await score.score_and_upload_videos(videos, await imagebind_loader.get_imagebind())
17113
17114 if youtube_rewards is None:
17115 print("YouTube rewards are empty, returning None")
17116 return None
17117
17118 total_rewards: float = youtube_rewards
17119
17120 print(f"Total Rewards: {total_rewards}")
17121 print(f"Returning score={total_rewards} for validator={uid} in {time.time() - start_time:.2f}s")
17122
17123 return total_rewards
17124
17125 if not IS_PROD:
17126 @app.get("/api/count_unique")
17127 async def count_unique(
17128 videos: Videos,
17129 ) -> str:
17130 nunique = await score.get_num_unique_videos(videos)
17131 return f"{nunique} out of {len(videos.video_metadata)} submitted videos are unique"
17132
17133 @app.get("/api/check_score")
17134 async def check_score(
17135 videos: Videos,
17136 ) -> dict:
17137 detailed_score = await score.score_videos_for_testing(videos, await imagebind_loader.get_imagebind())
17138 return detailed_score
17139
17140 @app.get("/api/topic")
17141 async def get_topic() -> str:
17142 return random.choice(TOPICS_LIST)
17143
17144 @app.get("/api/topics")
17145 async def get_topics() -> List[str]:
17146 return TOPICS_LIST
17147
17148 @app.get("/")
17149 def healthcheck():
17150 return datetime.utcnow()
17151
17152 ################ START MULTI-MODAL API / OPENTENSOR CONNECTOR ################
17153 @app.get("/api/mm/topics")
17154 async def get_mm_topics(api_key: str = Security(get_api_key)):
17155 try:
17156 connection = connect_to_db()
17157         query = "SELECT DISTINCT query FROM omega_multimodal"
17158 cursor = connection.cursor()
17159 cursor.execute(query)
17160 data = [row[0] for row in cursor.fetchall()]
17161
17162 cursor.close()
17163 connection.close()
17164 return data
17165 except mysql.connector.Error as err:
17166 raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17167
17168
17169 @app.get("/api/mm/topic_video_count")
17170 async def get_mm_topic_video_count(api_key: str = Security(get_api_key)):
17171 try:
17172 connection = connect_to_db()
17173         query = "SELECT query, COUNT(*) AS num_videos FROM omega_multimodal GROUP BY query"
17174 cursor = connection.cursor(dictionary=True)
17175 cursor.execute(query)
17176 data = cursor.fetchall()
17177
17178 cursor.close()
17179 connection.close()
17180 return data
17181 except mysql.connector.Error as err:
17182 raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17183
17184
17185 @app.get("/api/mm/topic_relevant/{topic}")
17186 async def get_mm_topic_relevant(api_key: str = Security(get_api_key), topic: str = Path(...)):
17187 try:
17188 connection = connect_to_db()
17189         query = "SELECT video_id, youtube_id, description, start_time, end_time FROM omega_multimodal WHERE query = %s ORDER BY query_relevance_score DESC LIMIT 100"
17190         cursor = connection.cursor(dictionary=True)
17191         cursor.execute(query, (topic,))
17192 data = cursor.fetchall()
17193
17194 cursor.close()
17195 connection.close()
17196 return data
17197 except mysql.connector.Error as err:
17198 raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17199 ################ END MULTI-MODAL API / OPENTENSOR CONNECTOR ################
17200
17201 ################ START LEADERBOARD ################
17202 @app.get("/api/leaderboard")
17203 async def get_leaderboard_data(hotkey: Optional[str] = None, sort_by: Optional[str] = None, sort_order: Optional[str] = None):
17204 try:
17205 leaderboard_table_name = "miner_leaderboard"
17206 if not IS_PROD:
17207 leaderboard_table_name += "_test"
17208 connection = connect_to_db()
17209 query = f"SELECT * FROM {leaderboard_table_name}"
17210 params = []
17211
17212 # Filter by hotkey if provided
17213 if hotkey:
17214 query += " WHERE hotkey = %s"
17215 params.append(hotkey)
17216
17217 # Sort by the specified column if provided, default to 'datapoints'
17218         sort_column = "datapoints" # Default sort column
17219         sort_direction = "DESC" # Default sort order
17220         if sort_by:
17221             # Validate and map sort_by to actual column names if necessary
17222             valid_sort_columns = {
17223                 "datapoints": "datapoints",
17224                 "avg_desc_relevance": "avg_desc_relevance",
17225                 "avg_query_relevance": "avg_query_relevance",
17226                 "avg_novelty": "avg_novelty",
17227                 "avg_score": "avg_score",
17228                 "last_updated": "last_updated"
17229             }
17230             sort_column = valid_sort_columns.get(sort_by, sort_column)
17231         if sort_order:
17232             # Validate sort_order, falling back to the default direction
17233             valid_sort_orders = {
17234                 "asc": "ASC",
17235                 "desc": "DESC"
17236             }
17237             sort_direction = valid_sort_orders.get(sort_order.lower(), sort_direction)
17238
17239         query += f" ORDER BY {sort_column} {sort_direction}"
17240
17241 cursor = connection.cursor(dictionary=True)
17242 cursor.execute(query, params)
17243 data = cursor.fetchall()
17244
17245 cursor.close()
17246 connection.close()
17247 return data
17248 except mysql.connector.Error as err:
17249 raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17250
17251 @app.get("/leaderboard")
17252 def leaderboard():
17253 return FileResponse('./validator-api/static/leaderboard.html')
17254
17255 @app.get("/api/leaderboard-dataset-data")
17256 async def get_leaderboard_dataset_data():
17257 try:
17258 connection = connect_to_db()
17259 query = "SELECT * FROM hf_dataset_snapshots ORDER BY snapshot_date ASC"
17260 cursor = connection.cursor(dictionary=True)
17261 cursor.execute(query)
17262 data = cursor.fetchall()
17263
17264 cursor.close()
17265 connection.close()
17266 return data
17267 except mysql.connector.Error as err:
17268 raise HTTPException(status_code=500, detail=f"Error fetching leaderboard dataset data from MySQL database: {err}")
17269
17270 @app.get("/api/leaderboard-miner-data")
17271 async def get_leaderboard_miner_data(hotkey: Optional[str] = None):
17272 try:
17273 connection = connect_to_db()
17274 params = []
17275
17276         query = "SELECT * FROM miner_leaderboard_snapshots WHERE 1=1"
17277
17278 # Filter by hotkey if provided
17279 if hotkey:
17280 query += " AND hotkey = %s"
17281 params.append(hotkey)
17282
17283 query += " ORDER BY snapshot_date ASC"
17284
17285 cursor = connection.cursor(dictionary=True)
17286 cursor.execute(query, params)
17287 data = cursor.fetchall()
17288
17289 cursor.close()
17290 connection.close()
17291 return data
17292 except mysql.connector.Error as err:
17293 raise HTTPException(status_code=500, detail=f"Error fetching leaderboard miner data from MySQL database: {err}")
17294
17295 @app.get("/api/leaderboard-focus-data")
17296 async def get_leaderboard_focus_data():
17297 try:
17298 connection = connect_to_db()
17299 query = "SELECT * FROM focus_kpi_snapshots ORDER BY snapshot_date ASC"
17300 cursor = connection.cursor(dictionary=True)
17301 cursor.execute(query)
17302 data = cursor.fetchall()
17303
17304 cursor.close()
17305 connection.close()
17306 return data
17307 except mysql.connector.Error as err:
17308 raise HTTPException(status_code=500, detail=f"Error fetching focus kpi data from MySQL database: {err}")
17309 ################ END LEADERBOARD ################
17310
17311 ################ START DASHBOARD ################
17312 async def resync_dataset():
17313     """Resyncs the dataset by updating our JSON data source from the huggingface dataset."""
17314     while True:
17315         print("resync_dataset()")
17316
17317 max_attempts = 3
17318 attempt = 0
17319
17320 while attempt < max_attempts:
17321 try:
17322 pull_and_cache_dataset()
17323 break # Exit the loop if the function succeeds
17324
17325 # In case of unforeseen errors, the api will log the error and continue operations.
17326 except Exception as err:
17327 attempt += 1
17328 print(f"Error during dataset sync (Attempt {attempt}/{max_attempts}):", str(err))
17329 #print_exception(type(err), err, err.__traceback__)
17330
17331 if attempt >= max_attempts:
17332 print("Max attempts reached. Skipping this sync cycle.")
17333 break
17334
17335 # Sleep in seconds
17336 await asyncio.sleep(1800) # 30 minutes
17337
17338 @app.get("/dashboard/get-video-metadata")
17339 async def get_video_metadata(
17340 sort_by: Optional[str] = "submitted_at",
17341 sort_order: Optional[str] = "desc",
17342 page: Optional[int] = 1,
17343 items_per_page: Optional[int] = 50
17344 ):
17345 print("get_video_metadata()")
17346 if os.path.exists(CACHE_FILE):
17347 with open(CACHE_FILE, "r") as f:
17348 descriptions = json.load(f)
17349
17350 # Define a mapping from sort_by parameter to the index in the metadata list
17351 sort_index_mapping = {
17352 "video_id": 0,
17353 "youtube_id": 1,
17354 "start_time": 2,
17355 "end_time": 3,
17356 "description": 4,
17357 "description_relevance_score": 5,
17358 "query_relevance_score": 6,
17359 "query": 7,
17360 "submitted_at": 8
17361 }
17362
17363 if sort_by and sort_by in sort_index_mapping:
17364 index = sort_index_mapping[sort_by]
17365 reverse = sort_order == "desc"
17366 descriptions.sort(key=lambda x: x[index], reverse=reverse)
17367
17368 # Pagination logic
17369 total_items = len(descriptions)
17370 start = (page - 1) * items_per_page
17371 end = start + items_per_page
17372 paginated_descriptions = descriptions[start:end]
17373
17374 for video in paginated_descriptions:
17375 video[0] = ".." + str(video[0])[:6]
17376 video[5] = round(video[5], 4) # Round description_relevance_score
17377 video[6] = round(video[6], 4) # Round query_relevance_score
17378 date_time = datetime.fromtimestamp(video[8])
17379 video[8] = date_time.strftime('%Y-%m-%d %H:%M:%S') # Format submitted_at
17380
17381 return {
17382 "total_items": total_items,
17383 "page": page,
17384 "items_per_page": items_per_page,
17385 "data": paginated_descriptions
17386 }
17387 else:
17388 return {"error": "Cache file not found"}
17389
17390 @app.get("/dashboard")
17391 def dashboard():
17392 print("dashboard()")
17393 return FileResponse('validator-api/static/dashboard.html')
17394 ################ END DASHBOARD ################
17395
17396 async def run_server():
17397 print("run_server()")
17398 config = uvicorn.Config(app=app, host="0.0.0.0", port=8001)
17399 server = uvicorn.Server(config)
17400 await server.serve()
17401
17402 async def main():
17403     server_task = asyncio.create_task(run_server())
17404     try:
17405         tasks_list = [
17406             server_task,
17407             resync_metagraph(),
17408             cache_max_focus_tao(),
17409         ]
17410         if IS_PROD:
17411             tasks_list.append(resync_dataset())
17412         await asyncio.gather(*tasks_list)
17413     except asyncio.CancelledError:
17414         server_task.cancel()
17415         await server_task
17416
17417if __name__ == "__main__":
17418 asyncio.run(main())
17419
17420
17421
17422---
17423File: /validator-api/clear_index.py
17424---
17425
17426from validator_api import config
17427from pinecone import Pinecone
17428
17429PINECONE_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_INDEX)
17430PINECONE_INDEX.delete(delete_all=True)
17431
17432
17433
17434---
17435File: /validator-api/test_search_and_submit.py
17436---
17437
17438from omega.miner_utils import search_and_embed_youtube_videos, ImageBind, video_utils
17439from omega.protocol import Videos
17440from validator_api.dataset_upload import dataset_uploader
17441from validator_api.score import score_and_upload_videos
17442import asyncio
17443import time
17444
17445imagebind = ImageBind()
17446start = time.time()
17447query = "minecraft gameplay footage"
17448num_videos = 1
17449video_metadata_list = search_and_embed_youtube_videos(query, num_videos, imagebind)
17450print(f"Search and embed took {time.time() - start} seconds")
17451
17452videos = Videos(
17453 query=query,
17454 num_videos=num_videos,
17455 video_metadata=video_metadata_list,
17456)
17457
17458# dataset_uploader.min_batch_size = 2 # override to force upload
17459# dataset_uploader.desired_batch_size = 2 # override to force upload
17460# print(asyncio.run(score_and_upload_videos(videos, imagebind)))
17461
17462
17463
17464---
17465File: /auto_updating_validator.sh
17466---
17467
17468#!/bin/bash
17469
17470VALIDATOR_ARGS=$@
17471
17472# first, git pull
17473git pull
17474
17475# next, set up environment
17476pip install -e .
17477
17478# finally, run the validator
17479python neurons/validator.py $VALIDATOR_ARGS --neuron.auto_update
17480
17481
17482
17483---
17484File: /purchase_focus_video.py
17485---
17486
17487"""
17488Using the OMEGA Focus Video Purchase System:
17489
174901. Setup:
17491 - Ensure you have the latest required libraries installed. See requirements.txt.
17492 - Make sure you have your SN24 Bittensor wallet set up. You *MUST* use your SN24 registered wallet to purchase videos.
17493
174942. Running the Script:
17495 - Open a terminal and navigate to the directory containing the script.
17496 - Run the script with: `python purchase_focus_video.py`
17497
174983. Main Menu Options:
17499 When you run the script, you'll see a menu with 5 options:
17500
17501 1. View Focus Videos
17502 2. Purchase Focus Video
17503 3. Verify Purchase
17504 4. Display Order History
17505 5. Exit
17506
175074. Using the Options:
17508
17509 Option 1: View Focus Videos
17510 - Displays a list of available focus videos with details like Video ID, Score, Cost, and Expected Reward.
17511 - The displayed cost is the amount of TAO tokens required to purchase the video.
17512 - The expected reward is the amount of TAO tokens you'll earn from SN24 emissions for purchasing the video.
17513 - Select a number from the list next to the video you want to purchase.
17514
17515 Option 2: Purchase Focus Video
17516 - Allows you to purchase a video by entering its ID.
17517 - You'll need to provide your wallet information (name, hotkey, path).
17518 - The script will initiate a transfer of TAO tokens to the OMEGA Focus App user who created the video. This secures the purchase of the video.
17519 - After the transfer is complete, the script will attempt to verify the purchase.
17520 - Once successful, you're all set! SN24 validators will automatically detect your purchase and reward your expected TAO emissions.
17521
17522 Option 3: Verify Purchase
17523 - This option is used when there are issues with the purchase verification during the purchase process.
17524 - If you've successfully transferred the TAO tokens but the purchase wasn't verified, you can use this option to verify the purchase.
17525 - You'll need to provide the Video ID, Miner Hotkey, and Block Hash.
17526
17527 Option 4: Display Order History
17528 - Shows a list of your previous purchases and their current status.
17529
17530 Option 5: Exit
17531 - Closes the application.
17532
175335. Important Notes:
17534    - The script can be run using Bittensor mainnet or testnet based on the SUBTENSOR_NETWORK variable. Set it to "test" for testnet. Set to None for mainnet.
17535 - Purchases are saved locally in '~/.omega/focus_videos.json'.
17536 - Always ensure you have sufficient TAO tokens in your wallet before making a purchase.
17537 - Once a purchase has been verified successful, SN24 validators will automatically detect your purchase and reward your expected TAO emissions.
17538
175396. Wallet Information:
17540 - When purchasing, you'll need to provide your Bittensor wallet details.
17541 - You *MUST* use your SN24 registered wallet to purchase videos.
17542 - The default wallet path is '~/.bittensor/wallets/'.
17543
17544Remember to keep your wallet information secure and never share your private keys.
17545"""
17546
17547import os
17548import requests
17549import bittensor as bt
17550from bittensor import wallet as btcli_wallet
17551import argparse
17552import time
17553import json
17554from tabulate import tabulate
17555from datetime import datetime
17556import multiprocessing
17557import sys
17558
17559parser = argparse.ArgumentParser(description='Interact with the OMEGA Focus Videos API.')
17560args = parser.parse_args()
17561
17562SUBTENSOR_NETWORK = None # "test" or None
17563
17564API_BASE = (
17565 "https://dev-validator.api.omega-labs.ai"
17566 if SUBTENSOR_NETWORK == "test" else
17567 "https://validator.api.omega-labs.ai"
17568)
17569
17570CYAN = "\033[96m"
17571GREEN = "\033[92m"
17572RED = "\033[91m"
17573RESET = "\033[0m"
17574
17575def initialize_subtensor():
17576 try:
17577 subtensor = bt.subtensor(network=SUBTENSOR_NETWORK)
17578 #print(f"{GREEN}Subtensor initialized successfully.{RESET}")
17579 return subtensor
17580 except Exception as e:
17581 print(f"{RED}Error initializing subtensor: {str(e)}{RESET}")
17582 raise
17583
17584def list_videos():
17585 videos_response = requests.get(
17586 API_BASE + "/api/focus/get_list",
17587 headers={"Content-Type": "application/json"},
17588 timeout=30
17589 )
17590
17591 if videos_response.status_code != 200:
17592 print(f"{RED}Error fetching focus videos: {videos_response.status_code}{RESET}")
17593 return None
17594
17595 videos_data = videos_response.json()
17596 return videos_data
17597
17598def display_videos(videos_data):
17599 if not videos_data or len(videos_data) == 0:
17600 print(f"\n{RED}No videos available.{RESET}")
17601 return
17602
17603 print(f"\n{CYAN}Available Focus Videos:{RESET}")
17604
17605 # Prepare the data for tabulate
17606 table_data = []
17607 for idx, video in enumerate(videos_data, 1):
17608 # Convert created_at to a more readable format
17609 created_at = datetime.fromisoformat(video['created_at'].replace('Z', '+00:00'))
17610 formatted_date = created_at.strftime("%Y-%m-%d %H:%M:%S")
17611
17612 table_data.append([
17613 idx,
17614 video['video_id'],
17615 f"{video['video_score']:.3f}",
17616 f"{video['expected_reward_tao']:.5f}",
17617 f"{float(video['expected_reward_tao']) / 0.9:.5f}",
17618 #formatted_date
17619 ])
17620
17621 # Create the table
17622 headers = ["#", "Video ID", "Score", "Cost (TAO)", "Expected Reward (TAO)"]
17623 table = tabulate(table_data, headers=headers, tablefmt="pretty")
17624
17625 print(table)
17626
17627
17628class TransferTimeout(Exception):
17629 pass
17630
17631def reset_terminal():
17632 # Try multiple methods to reset the terminal
17633 os.system('stty sane')
17634 os.system('reset')
17635 sys.stdout.write('\033[0m')
17636 sys.stdout.flush()
17637
17638def transfer_operation(wallet, transfer_address_to, transfer_balance, result_queue):
17639 try:
17640 subtensor = initialize_subtensor()
17641 success, block_hash, err_msg = subtensor._do_transfer(
17642 wallet,
17643 transfer_address_to,
17644 transfer_balance,
17645 wait_for_finalization=True,
17646 wait_for_inclusion=True,
17647 )
17648 result_queue.put((success, block_hash, err_msg))
17649 except Exception as e:
17650 result_queue.put((False, None, str(e)))
17651
17652def transfer_with_timeout(wallet, transfer_address_to, transfer_balance):
17653 result_queue = multiprocessing.Queue()
17654
17655 transfer_process = multiprocessing.Process(
17656 target=transfer_operation,
17657 args=(wallet, transfer_address_to, transfer_balance, result_queue)
17658 )
17659
17660 transfer_process.start()
17661 transfer_process.join(timeout=150) # 2m 30s = 150 seconds
17662
17663     if transfer_process.is_alive():
17664         transfer_process.terminate()
17665         transfer_process.join()
17666         reset_terminal()
17667         raise TransferTimeout("Transfer operation timed out after 2 minutes 30 seconds.")
17668
17669     if not result_queue.empty():
17670         return result_queue.get()
17671     else:
17672         return False, None, "Transfer process exited without result"
17673
17674def purchase_video(video_id=None, wallet_name=None, wallet_hotkey=None, wallet_path=None):
17675 if not video_id:
17676 video_id = input(f"{CYAN}Enter focus video id: {RESET}")
17677
17678 if wallet_name is not None:
17679 name = wallet_name
17680 else:
17681 name = input(f"{CYAN}Enter wallet name (default: Coldkey): {RESET}") or "Coldkey"
17682 if wallet_hotkey is not None:
17683 hotkey_name = wallet_hotkey
17684 else:
17685 hotkey_name = input(f"{CYAN}Enter wallet hotkey name (default: Hotkey): {RESET}") or "Hotkey"
17686 if wallet_path is not None:
17687 path = wallet_path
17688 else:
17689 path = input(f"{CYAN}Enter wallet path (default: ~/.bittensor/wallets/): {RESET}") or "~/.bittensor/wallets/"
17690
17691 wallet = btcli_wallet(name=name, hotkey=hotkey_name, path=path)
17692 try:
17693 hotkey = wallet.get_hotkey()
17694 except Exception as e:
17695 print(f"{RED}Error loading hotkey: {e} {RESET}")
17696 return
17697
17698 miner_hotkey = hotkey.ss58_address
17699
17700 print(f"Purchasing video {video_id}...")
17701 print(f"{RED}You will only have 2 minutes and 30 seconds to complete the transfer of TAO tokens, otherwise the purchase will be reverted.{RESET}")
17702 purchase_response = requests.post(
17703 API_BASE + "/api/focus/purchase",
17704 json={"video_id": video_id, "miner_hotkey": miner_hotkey},
17705 headers={"Content-Type": "application/json"},
17706 timeout=60
17707 )
17708
17709 purchase_data = purchase_response.json()
17710 if purchase_response.status_code != 200:
17711 print(f"{RED}Error purchasing video {video_id}: {purchase_response.status_code}{RESET}")
17712 if "detail" in purchase_data:
17713 print(f"{RED}Details: {purchase_data['detail']}{RESET}")
17714 return
17715
17716 if "status" in purchase_data and purchase_data["status"] == "error":
17717 print(f"{RED}Error purchasing video {video_id}: {purchase_data['message']}{RESET}")
17718 return
17719
17720 try:
17721 transfer_address_to = purchase_data["address"]
17722 transfer_amount = purchase_data["amount"]
17723
17724 print(f"Initiating transfer of {transfer_amount} TAO for video {video_id}...")
17725
17726 transfer_balance = bt.Balance.from_tao(transfer_amount)
17727
17728
17729 try:
17730 success, block_hash, err_msg = transfer_with_timeout(wallet, transfer_address_to, transfer_balance)
17731 except TransferTimeout:
17732 print(f"\n{RED}Transfer operation timed out after 2 minutes and 30 seconds. Aborting purchase.{RESET}")
17733 reset_terminal()
17734 revert_pending_purchase(video_id)
17735 repurchase_input(video_id, name, hotkey_name, path)
17736 return
17737
17748 if success:
17749 print(f"{GREEN}Transfer finalized. Block Hash: {block_hash}{RESET}")
17750 save_purchase_info(video_id, miner_hotkey, block_hash, "purchased", transfer_amount)
17751 verify_result = verify_purchase(video_id, miner_hotkey, block_hash)
17752 if not verify_result:
17753 print(f"{RED}There was an error verifying your purchase after successfully transferring TAO. Please try the 'Verify Purchase' option immediately and contact an admin if you are unable to successfully verify.{RESET}")
17754 else:
17755 print(f"{RED}Failed to complete transfer for video {video_id}.{RESET}")
17756 revert_pending_purchase(video_id)
17757 repurchase_input(video_id, name, hotkey_name, path)
17758
17759 except Exception as e:
17760 print(f"{RED}Error transferring TAO tokens: {str(e)}{RESET}")
17761 if "EOF occurred in violation of protocol" in str(e):
17762 print(f"{RED}Subtensor connection error detected. Re-initializing subtensor.{RESET}")
17763 initialize_subtensor()
17764 revert_pending_purchase(video_id)
17765 repurchase_input(video_id, name, hotkey_name, path)
17766
17767def revert_pending_purchase(video_id):
17768     print(f"Reverting pending purchase of video {video_id}...")
17769 revert_response = requests.post(
17770 API_BASE + "/api/focus/revert-pending-purchase",
17771 json={"video_id": video_id},
17772 headers={"Content-Type": "application/json"},
17773 timeout=60
17774 )
17775     if revert_response.status_code != 200:
17776         print(f"{RED}Error reverting pending purchase of video {video_id}: {revert_response.status_code}{RESET}")
17777         return
17778     print(f"{GREEN}Pending purchase of video {video_id} reverted successfully.{RESET}")
17781
17782def repurchase_input(video_id, wallet_name=None, wallet_hotkey=None, wallet_path=None):
17783 repurchase = input(f"{CYAN}Do you want to repurchase video {video_id}? (y/n): {RESET}").lower()
17784 if repurchase == 'y':
17785 purchase_video(video_id, wallet_name, wallet_hotkey, wallet_path)
17786 elif repurchase != 'n':
17787 print(f"{RED}Invalid input. Please enter 'y' or 'n'.{RESET}")
17788 repurchase_input(video_id, wallet_name, wallet_hotkey, wallet_path)
17789
17790def display_saved_orders(for_verification=False):
17791 purchases_file = os.path.expanduser("~/.omega/focus_videos.json")
17792 if not os.path.exists(purchases_file):
17793 print(f"{RED}No saved orders found.{RESET}")
17794 return None
17795
17796 with open(purchases_file, 'r') as f:
17797 purchases = json.load(f)
17798
17799 if not purchases:
17800 print(f"{RED}No saved orders found.{RESET}")
17801 return None
17802
17803 purchases.sort(key=lambda x: x.get('created_at', ''), reverse=True)
17804
17805 print(f"\n{CYAN}Saved Orders:{RESET}")
17806
17807 table_data = []
17808 for idx, purchase in enumerate(purchases, 1):
17809 created_at = purchase.get('created_at', 'N/A')
17810 if created_at != 'N/A':
17811 created_at = datetime.fromisoformat(created_at.replace('Z', '+00:00')).strftime("%Y-%m-%d %H:%M:%S")
17812
17813 table_data.append([
17814 idx,
17815 purchase['video_id'],
17816 purchase['state'],
17817 purchase.get('amount', 'N/A'),
17818 f"{float(purchase.get('amount', 0)) / 0.9:.5f}",
17819 purchase.get('miner_hotkey', 'N/A')[:5] + '...' + purchase.get('miner_hotkey', 'N/A')[-5:],
17820 purchase['block_hash'][:5] + '...' + purchase['block_hash'][-5:],
17821 created_at
17822 ])
17823
17824 headers = ["#", "Video ID", "Purchase State", "Cost (TAO)", "Estimated Reward (TAO)", "Purchasing Hotkey", "Block Hash", "Purchase Date"]
17825 table = tabulate(table_data, headers=headers, tablefmt="pretty")
17826
17827 print(table)
17828 return purchases
17829
17830def select_order_for_verification():
17831 purchases = display_saved_orders()
17832
17833 while True:
17834 if purchases:
17835 print(f"*** NOTE: A purchase is finalized when the purchase state is 'verified'. ***")
17836 choice = input(f"{CYAN}Enter the number of the order to verify, 'm' for manual input, or 'n' to cancel: {RESET}").lower()
17837 else:
17838 choice = 'm'
17839
17840 if choice == 'n':
17841 return None, None, None
17842 elif choice == 'm':
17843 video_id = input(f"{CYAN}Enter video ID: {RESET}")
17844 miner_hotkey = input(f"{CYAN}Enter miner hotkey: {RESET}")
17845 block_hash = input(f"{CYAN}Enter block hash: {RESET}")
17846 return video_id, miner_hotkey, block_hash
17847 elif choice.isdigit():
17848 idx = int(choice) - 1
17849 if 0 <= idx < len(purchases):
17850 selected = purchases[idx]
17851 return selected['video_id'], selected.get('miner_hotkey', ''), selected['block_hash']
17852 else:
17853 print(f"{RED}Invalid selection. Please try again.{RESET}")
17854 else:
17855 print(f"{RED}Invalid input. Please try again.{RESET}")
17856
17857def select_order_for_full_display(purchases):
17858 while True:
17859 choice = input(f"{CYAN}Enter the number of the order to see full details, or 'n' to return to menu: {RESET}").lower()
17860
17861 if choice == 'n':
17862 return
17863 elif choice.isdigit():
17864 idx = int(choice) - 1
17865 if 0 <= idx < len(purchases):
17866 selected = purchases[idx]
17867 # Display full details
17868 print(f"\n{CYAN}Order Details:{RESET}")
17869 print(f"Video ID: {selected['video_id']}")
17870 print(f"Purchase State: {selected['state']}")
17871 print(f"Cost (TAO): {selected.get('amount', 'N/A')}")
17872 print(f"Estimated Reward (TAO): {float(selected.get('amount', 0)) / 0.9:.5f}")
17873 print(f"Purchasing Hotkey: {selected.get('miner_hotkey', 'N/A')}")
17874 print(f"Block Hash: {selected['block_hash']}")
17875 print(f"Purchase Date: {selected.get('created_at', 'N/A')}")
17876 return
17877 else:
17878 print(f"{RED}Invalid selection. Please try again.{RESET}")
17879 else:
17880 print(f"{RED}Invalid input. Please try again.{RESET}")
17881
17882def verify_purchase(video_id=None, miner_hotkey=None, block_hash=None):
17883 if not all([video_id, miner_hotkey, block_hash]):
17884 video_id, miner_hotkey, block_hash = select_order_for_verification()
17885 if not all([video_id, miner_hotkey, block_hash]):
17886 print(f"{CYAN}Verification cancelled.{RESET}")
17887 return
17888
17889 print(f"Verifying purchase for video {video_id} on block hash {block_hash} ...")
17890
17891 retries = 3
17892 for attempt in range(retries):
17893 try:
17894 verify_response = requests.post(
17895 API_BASE + "/api/focus/verify-purchase",
17896 json={"miner_hotkey": miner_hotkey, "video_id": video_id, "block_hash": block_hash},
17897 headers={"Content-Type": "application/json"},
17898 timeout=90
17899 )
17900 print(f"Purchase verification response for video {video_id}:", verify_response.text)
17901 if verify_response.status_code == 200:
17902 print(f"{GREEN}Purchase verified successfully!{RESET}")
17903 save_purchase_info(video_id, miner_hotkey, block_hash, "verified")
17904 return True
17905
17906 if attempt < retries - 1:
17907 print(f"{CYAN}Attempt #{attempt + 1} to verify purchase failed. Retrying in 2 seconds...{RESET}")
17908 time.sleep(2)
17909 except Exception as e:
17910 if attempt < retries - 1:
17911 print(f"{CYAN}Attempt #{attempt + 1} to verify purchase failed. Retrying in 2 seconds...{RESET}")
17912 print(f"{RED}Error: {str(e)}{RESET}")
17913 time.sleep(2)
17914 else:
17915 print(f"{RED}All {retries} attempts failed. Unable to verify purchase.{RESET}")
17916 return False
17917
17918def save_purchase_info(video_id, hotkey, block_hash, state, amount=None):
17919 purchases_file = os.path.expanduser("~/.omega/focus_videos.json")
17920 os.makedirs(os.path.dirname(purchases_file), exist_ok=True)
17921
17922 purchases = []
17923 if os.path.exists(purchases_file):
17924 with open(purchases_file, 'r') as f:
17925 purchases = json.load(f)
17926
17927 # Check if the video_id already exists
17928 for purchase in purchases:
17929 if purchase['video_id'] == video_id:
17930 purchase['state'] = state
17931 purchase['miner_hotkey'] = hotkey
17932 purchase['block_hash'] = block_hash
17933 if amount is not None:
17934 purchase['amount'] = amount
17935 break
17936 else:
17937 # If the video_id doesn't exist, create a new entry
17938 new_purchase = {
17939 "video_id": video_id,
17940 "miner_hotkey": hotkey,
17941 "block_hash": block_hash,
17942 "state": state,
17943 "created_at": datetime.now().isoformat() # Add creation timestamp
17944 }
17945 if amount is not None:
17946 new_purchase['amount'] = amount
17947 purchases.append(new_purchase)
17948
17949 with open(purchases_file, 'w') as f:
17950 json.dump(purchases, f, indent=2)
17951
17952 print(f"{GREEN}Purchase information {'updated' if state == 'verified' else 'saved'} to {purchases_file}{RESET}")
17953
17954def main():
17955 while True:
17956 print(f"\n{CYAN}Welcome to the OMEGA Focus Videos Purchase System{RESET}")
17957 print("1. View + Purchase Focus Videos")
17958 print("2. Manually Purchase Focus Video")
17959 print("3. Verify Purchase")
17960 print("4. Display Order History")
17961 print("5. Exit")
17962
17963 choice = input(f"{CYAN}Enter your choice (1-5): {RESET}")
17964
17965 if choice == '1':
17966 videos_data = list_videos()
17967 if videos_data:
17968 display_videos(videos_data)
17969 purchase_option = input(f"\n{CYAN}Enter the number of the video you want to purchase or press 'n' to return to menu: {RESET}").lower()
17970 if purchase_option.isdigit():
17971 video_index = int(purchase_option) - 1
17972 if 0 <= video_index < len(videos_data):
17973 purchase_video(videos_data[video_index]['video_id'])
17974 else:
17975 print(f"{RED}Invalid video number.{RESET}")
17976 elif purchase_option != 'n':
17977 print(f"{RED}Invalid input. Returning to main menu.{RESET}")
17978 else:
17979 print(f"\n{RED}No videos available for purchase at this time.{RESET}")
17980 elif choice == '2':
17981 purchase_video()
17982 elif choice == '3':
17983 verify_purchase()
17984 elif choice == '4':
17985 purchases = display_saved_orders()
17986 select_order_for_full_display(purchases)
17987 elif choice == '5':
17988 print(f"{GREEN}Thank you for using the OMEGA Focus Videos Purchase System. Goodbye!{RESET}")
17989 break
17990 else:
17991 print(f"{RED}Invalid choice. Please try again.{RESET}")
17992
17993if __name__ == "__main__":
17994 try:
17995 multiprocessing.freeze_support()
17996 main()
17997 except KeyboardInterrupt:
17998 print("\nScript interrupted by user. Exiting.")
17999 reset_terminal()
18000 sys.exit(0)
18001 except Exception as e:
18002 print(f"\nAn unexpected error occurred: {str(e)}")
18003 reset_terminal()
18004 sys.exit(1)
18005
18006
18007
18008---
18009File: /README.md
18010---
18011
18012<div align="center">
18013
18014# OMEGA Labs Bittensor Subnet: The World's Largest Decentralized AGI Multimodal Dataset <!-- omit in toc -->
18015[](https://omegatron.ai)
18016[](https://opensource.org/licenses/MIT)
18017
18018---
18019
18020## Be, and it becomes ... <!-- omit in toc -->
18021
18022</div>
18023
18024---
18025- [Introduction](#introduction)
18026- [Key Features](#key-features)
18027- [Miner and Validator Functionality](#miner-and-validator-functionality)
18028 - [Miner](#miner)
18029 - [Validator](#validator)
18030- [Roadmap](#roadmap)
18031- [Running Miners and Validators](#running-miners-and-validators)
18032 - [Running a Miner](#running-a-miner)
18033 - [Running a Validator](#running-a-validator)
18034- [Contributing](#contributing)
18035- [License](#license)
18036
18037---
18038## Introduction
18039
18040Welcome to the OMEGA Labs Bittensor subnet, a groundbreaking initiative that aims to create the world's largest decentralized multimodal dataset for accelerating Artificial General Intelligence (AGI) research and development. Our mission is to democratize access to a vast and diverse dataset that captures the landscape of human knowledge and creation, empowering researchers and developers to push the boundaries of AGI.
18041
18042By harnessing the power of the Bittensor network and a global community of miners and validators, we are building a dataset that surpasses the scale and diversity of existing resources. With over 1 million hours of footage and 30 million+ 2-minute video clips, the OMEGA Labs dataset will enable the development of powerful AGI models and transform various industries.
18043
18044
18045## Key Features
18046
18047- 🌍 **Unparalleled Scale and Diversity**: 1 million+ hours of footage, 30 million+ video clips, covering 50+ scenarios and 15,000+ action phrases.
18048- 🧠 **Latent Representations**: Leveraging state-of-the-art models to translate video components into a unified latent space for efficient processing.
18049- 💰 **Incentivized Data Collection**: Rewarding miners for contributing high-quality, diverse, and novel videos through a decentralized network.
18050- 🤖 **Empowering Digital Agents**: Enabling the development of intelligent agents that can navigate complex workflows and assist users across platforms.
18051- 🎮 **Immersive Gaming Experiences**: Facilitating the creation of realistic gaming environments with rich physics and interactions.
18052
18053## Miner and Validator Functionality
18054
18055### Miner
18056
18057- Performs a simple search on YouTube and retrieves 8 videos at a time.
18058 - Provides a certain clip range (maximum of 2 minutes) and a description (caption) which includes the title, tags, and description of the video.
18059- Obtains the ImageBind embeddings for the video, audio, and caption.
18060 - Returns the video ID, caption, ImageBind embeddings (video, audio, caption embeddings), and start and end times for the clips (maximum of 2 minutes), as sketched below.
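
To make the miner flow above concrete, here is a minimal sketch that mirrors `test_search_and_submit.py` from this repo; the call signatures and the example query come from that test file, and the number of videos follows the "8 videos at a time" convention above:

```python
from omega.miner_utils import search_and_embed_youtube_videos, ImageBind
from omega.protocol import Videos

# Search YouTube, clip each result to <= 2 minutes, and embed with ImageBind.
imagebind = ImageBind()
query = "minecraft gameplay footage"  # example topic; validators send the real one
video_metadata_list = search_and_embed_youtube_videos(query, 8, imagebind)

# The response returned to the validator bundles the query and the embedded clips.
videos = Videos(query=query, num_videos=8, video_metadata=video_metadata_list)
```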
18061
18062### Validator
18063
18064- Takes the received videos from the miners and randomly selects one video for validation.
18065- Computes the ImageBind embeddings for all three modalities (video, audio, caption) of the selected video.
18066- Compares the quality of the embeddings to ensure they are consistent with the miner's submissions.
18067- If the selected video passes the validation, assumes all eight videos from the miner are valid.
18068 - Scores the videos based on relevance, novelty, and detail richness (see the sketch after this list):
18069 - Relevance: Calculated using cosine similarity between the topic embedding and each of the eight videos.
18070 - Novelty: For each video, finds the closest video in the Pinecone index and computes 1 - similarity.
18071 - Potential issue: Choosing the second most similar video instead of the most similar one.
18072 - Detail Richness: Determined by the cosine similarity between the text and video embeddings.
18073- Collects 1024 validated video entries and pushes them to Hugging Face as a file, which is then concatenated.
18074   - If a miner submits too frequently, the validator may increase the file accumulation threshold.
18075 - If the API needs to shut down for any reason, it will submit the remaining validated entries.
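
The three per-clip scores in the list above all reduce to cosine similarities between ImageBind embeddings. Here is a minimal sketch of that arithmetic, assuming plain NumPy vectors; the function and variable names are ours, not the validator's exact implementation:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_clip(topic_emb, video_emb, caption_emb, nearest_index_emb):
    relevance = cosine_sim(topic_emb, video_emb)               # topic vs. video embedding
    novelty = 1.0 - cosine_sim(video_emb, nearest_index_emb)   # vs. closest match in Pinecone
    detail_richness = cosine_sim(caption_emb, video_emb)       # caption vs. video embedding
    return relevance, novelty, detail_richness
```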
18076
18077## SN24: Ω Focus Videos Submission
18078
18079We're excited to introduce a new feature in the SN24 ecosystem: the Focus Video submission and reward process. This system creates a robust marketplace for task-completion videos, leveraging the strengths of the Bittensor network. Here's how it works:
18080
18081### The Players
180821. Ω Focus users: Individuals who complete tasks and record their work
180832. SN24 miners: Network participants who can purchase Focus videos
180843. SN24 validators: Entities that validate and score submissions
180854. Ω Brain: Ω Focus's backend API that processes submissions
18086
18087### The Process
18088
18089#### 1. Task Completion and Recording
18090Ω Focus users create tasks for themselves within the app. They then complete these tasks while screen recording their work via the app.
18091
18092#### 2. Submission and Initial Processing
18093Once a task is completed, the user's screen recording and task metadata are uploaded to Ω Brain. This backend system processes the recording, extracting metadata and combining partial clips if necessary.
18094
18095#### 3. Scoring
18096Ω Brain forwards the processed video to the SN24 validator API. The validator scores the submission based on predefined criteria. To learn more about the scoring algorithm, check out [this section](#scoring-algorithm) below.
18097
18098#### 4. User Notification and Marketplace Listing
18099The Ω Focus user receives their score and an estimate of the potential TAO reward. They can then choose to submit their video to the SN24 Focus Videos marketplace.
18100
18101#### 5. Miner Purchase
18102SN24 miners can browse and purchase videos from the marketplace. To make a purchase, a miner notifies the SN24 validator API of their intent. The API informs the miner of the TAO amount to transfer to the Ω Focus user's wallet. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/focus_app_v1_integration/purchase_focus_video.py)
18103
18104#### 6. Transaction Verification
18105Once the miner transfers the TAO, they provide the transaction's block hash to the SN24 validator API. The API then verifies this transaction on the Bittensor chain's public ledger. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/cron/confirm_purchase.py#L55)
18106
18107#### 7. Miner Scoring and Reimbursement
18108 SN24 validators, while sending their YouTube scraping requests to miners, also check with the validator API to see if miners have purchased Focus Videos. Miners' scores are adjusted based on these purchases. Because validators increase the scores of miners who purchase videos from the marketplace, the Bittensor chain effectively reimburses miners for their Focus Video purchases over the following 24-hour period. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/omega/base/validator.py#L322-L326)
18109
18110#### 8. Impact on Miner Scores
18111Focus Video scores currently make up 2.5% of a miner's total SN24 score. We plan to increase this percentage as the system proves successful.
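
As a rough worked example of that weighting (treat this as an illustration only; the precise combination lives in the validator code linked above): if a miner's data-scraping component scores 0.80 and their Focus Video component scores 1.00, a linear 97.5% / 2.5% blend gives `0.975 * 0.80 + 0.025 * 1.00 = 0.805`.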
18112
18113#### 9. Video Availability for External Buyers
18114Once a Focus Video submission is marked as COMPLETED (which happens when a miner transfers TAO to the Ω Focus user), the video becomes available for purchase by external data buyers, such as AI research labs. (Note: This feature will be implemented in the future.)
18115
18116### Benefits
18117- Users are incentivized to complete and record valuable tasks
18118- Miners can improve their scores by purchasing high-quality Focus Videos
18119- The network gains a new source of verified, high-quality data
18120- External entities will gain access to a marketplace of task-completion videos
18121
18122We believe this system will create a vibrant ecosystem within SN24, driving value for all participants while generating useful data for the broader AI community. We're starting with a conservative 2.5% score impact for Focus Videos, but we're excited to see how this new feature develops and grows within our network.
18123
18124```mermaid
18125flowchart TD
18126 A["👤 Ω Focus User"] -->|"1️⃣ Complete task & record"| B
18127 B["🧠 Ω Brain"] -->|"2️⃣ Process video"| C
18128 C{"🛡️ SN24 Validator API"}
18129 C -->|"3️⃣ Score submission"| A
18130 A -->|"4️⃣ List video"| E["🎥 Focus Videos Marketplace"]
18131 F["⛏️ SN24 Miner"] -->|"5️⃣ Purchase video"| E
18132 F -->|"6️⃣ Transfer TAO"| G["💰 User Wallet"]
18133 F -.->|"7️⃣ Provide tx hash"| C
18134 C -.->|"8️⃣ Verify transaction"| I
18135 I["🔍 SN24 Validator"] -.->|"9️⃣ Check purchases & set weights"| H{"⛓️ Bittensor Chain"}
18136 H -.->|"🔟 Reimburse miners"| F
18137
18138 classDef user fill:#30336b,stroke:#333,stroke-width:2px,color:white;
18139 classDef brain fill:#eeac99,stroke:#333,stroke-width:2px,color:white;
18140 classDef api fill:#e06377,stroke:#333,stroke-width:2px,color:white;
18141 classDef market fill:#c83349,stroke:#333,stroke-width:2px,color:white;
18142 classDef miner fill:#5b9aa0,stroke:#333,stroke-width:2px,color:white;
18143 classDef validator fill:#f0932b,stroke:#333,stroke-width:2px,color:white;
18144 classDef chain fill:#6ab04c,stroke:#333,stroke-width:2px,color:white;
18145 classDef external fill:#61c0bf,stroke:#333,stroke-width:2px,color:white;
18146
18147 class A user;
18148 class B brain;
18149 class C api;
18150 class D,E market;
18151 class F miner;
18152 class G user;
18153 class H chain;
18154 class I validator;
18155 class J external;
18156```
18157
18158### Scoring Algorithm
18159
18160A task completion video's final score is a geometric average of five components:
18161
18162 #### Gemini-based scores
181631. task_gemini_score: Gemini's evaluation of the task's quality, based on the task overview and how it feeds into the community's goals and its relevance to teaching AI systems ([prompt](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/focus_scoring_prompts.py#L2))
181642. completion_gemini_score: Gemini's evaluation of how well the task was completed and how relevant the video content is to the task and the community's goals ([prompt](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/focus_scoring_prompts.py#L88))
18165
18166 #### Embedding-based scores
181673. task_uniqueness_score: Uniqueness of the task based on embedding similarity of the task overview with existing tasks in the system
181684. description_uniqueness_score: Uniqueness of the video description based on embedding similarity of the detailed video description with existing video annotations in the system
181695. video_uniqueness_score: Uniqueness of the video content based on embedding similarity of the video with existing videos in the system
18170
18171Each component contributes equally to the final score. We chose to use a geometric average to ensure that no individual component dominates the final score.
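
A minimal sketch of that geometric average (assuming all five components are already normalized to the same range; the function name is ours, and the real implementation is in the `scoring_service.py` linked below):

```python
import math

def final_focus_score(task_gemini: float, completion_gemini: float,
                      task_uniqueness: float, description_uniqueness: float,
                      video_uniqueness: float) -> float:
    scores = [task_gemini, completion_gemini, task_uniqueness,
              description_uniqueness, video_uniqueness]
    # Geometric mean: a single near-zero component drags the final score toward
    # zero, so no component can dominate and none can be safely neglected.
    return math.prod(scores) ** (1.0 / len(scores))
```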
18172
18173You can dig into the code implementation [here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/scoring_service.py#L240).
18174
18175### Why so Complicated?
18176
18177Anyone experienced with Bittensor is probably asking themselves right now: why is this video submission process so convoluted? Why not just have Ω Focus users be miners and be compensated directly via the Bittensor chain's emissions each epoch? There are a few reasons:
18178
181791. Bittensor’s emissions system awards miners constantly (every epoch), and miners who do not perform well are eventually deregistered and must buy in again (optimized for consistently high performance and throughput). We see Ω Focus users completing tasks and submitting their screen recordings with irregular schedules (some days you do awesome work, some days you rest). With less consistent schedules, we don’t want temporarily inactive users to be deregistered (and subsequently have to re-register to start earning again).
18180 2. Therefore, Ω Labs and SN24 miners act as intermediaries. Ω Focus users complete tasks and submit their recordings on an arbitrary schedule while SN24 miners are consistently buying up available screen recordings and submitting them to SN24 validators for verification.
181813. Once smart contracts are available on Bittensor, as Const mentioned recently, we will definitely move over to emitting rewards directly to Focus users in a fully decentralized manner.
18182
18183### Hmmm, this doesn't feel like it's fully decentralized
18184
18185Yes, we acknowledge that. Even while Smart Contracts are not available on Bittensor, there is still room for us to decentralize the scoring and purchase verification process further. Some next steps here include:
18186
181871. Use some decentralized database to store the Focus Video scores, annotations, and purchase status.
181882. Move the scoring to run locally on validators' machines via open-source video understanding models like Qwen2-VL-72b when it becomes available, or by simply having validators make requests to the Gemini API themselves in the meantime.
181893. Create a public dashboard where anyone in the Bittensor community can view the Focus Videos and their associated scores to judge for themselves the quality of the submissions.
18190
18191 All in all, this is an MVP release and we wanted to ship something out to get the ball rolling. We are 100% committed to urgently decentralizing the system as much as possible, but we also want to emphasize the novel nature of what we're implementing here and appreciate everyone's patience as we make the system more robust and decentralized.
18192
18193Learn more about the Ω Focus app in [this FAQ](https://focus.omega.inc).
18194
18195## Roadmap
18196
18197### Phase 1: Foundation (Q1 2024)
18198- [x] Launch OMEGA Labs subnet on Bittensor testnet
18199- [x] Reach 100,000 hours of footage and 3 million video clips
18200
18201### Phase 2: Expansion (Q2 2024)
18202- [x] Reach 250,000 hours of footage and 15 million video clips
18203- [x] Train and demo any-to-any models on the dataset
18204- [ ] Build synthetic data pipelines to enhance dataset quality
18205- [ ] Publish a research paper on the Bittensor-powered Ω AGI dataset
18206- [ ] Expand into running inference for state-of-the-art any-to-any multimodal models
18207
18208### Phase 3: Refinement (Q3 2024)
18209- [ ] Reach 500,000+ hours of footage and 30 million+ video clips
18210- [ ] Use the dataset to train powerful unified representation models
18211- [ ] Fine-tune any-to-any models for advanced audio-video synchronized generation
18212- [ ] Open up an auctioning page for companies and groups to bid on validation topics using various currencies (in addition to TAO)
18213- [ ] Develop state-of-the-art video processing models for applications such as:
18214 - Transcription
18215 - Motion analysis
18216 - Object detection and tracking
18217 - Emotion recognition
18218
18219### Phase 4: Application (Q4 2024)
18220- [ ] Train desktop & mobile action prediction models on the dataset
18221- [ ] Develop cross-platform digital agents MVP
18222
18223### Phase 5: Democratization (Q1 2025)
18224- [ ] Generalize the subnet for miners to upload videos from any data source
18225- [ ] Incentivize people to record and label their own data using non-deep learning approaches
18226
18227## Running Miners and Validators
18228### Running a Miner
18229#### Requirements
18230- Python 3.8+
18231- Pip
18232- GPU with at least 12 GB of VRAM or 24 GB if you'd like to run a local LLM
18233- If running on runpod, `runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04` is a good base template.
18234
18235#### Setup
182361. To start, clone the repository and `cd` to it:
18237```bash
18238git clone https://github.com/omegalabsinc/omegalabs-bittensor-subnet.git
18239cd omegalabs-bittensor-subnet
18240```
182412. Install ffmpeg. If you're on Ubuntu, just run: `apt-get -y update && apt-get install -y ffmpeg`.
182423. Install pm2 if you don't already have it: [pm2.io](https://pm2.io/docs/runtime/guide/installation/).
182434. Next, install the `omega` package: `pip install -e .`
18244
18245#### Run with PM2
18246```bash
18247pm2 start neurons/miner.py --name omega-miner -- \
18248 --netuid {netuid} \
18249 --wallet.name {wallet} \
18250 --wallet.hotkey {hotkey} \
18251 --axon.port {port} \
18252 --blacklist.force_validator_permit
18253```
18254
18255#### Tips for Better Incentive
18256 The subnet has become quite competitive, and the basic miner template is no longer sufficient to earn good emissions and avoid deregistration. Here are some tips to consider for improving your miner:
182571. Use proxies or frequently change your pod.
18258 a) We've heard good things about [Storm Proxies](https://stormproxies.com/).
182592. Make sure your videos are unique. You can de-duplicate your collected videos with this [video ID index](https://huggingface.co/datasets/jondurbin/omega-multimodal-ids) graciously offered by Jon, one of the miners on the OMEGA subnet.
182603. Improve the descriptions you are submitting alongside your uploaded videos. You can try doing this by using video captioning models or incorporating the transcript. Lots of experimentation room here.
182614. You can use the `check_score` endpoint that we offer to check your score breakdown. See [this gist](https://gist.github.com/salmanshah1d/f5a8e83cb4af6444ffdef4325a59b489).
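
For example, a minimal `check_score` request might look like the sketch below. The host is the validator API base URL used elsewhere in this repo, the payload fields mirror `omega.protocol.Videos`, and you would substitute your own embedded clip metadata; note the endpoint may only be exposed on non-production deployments:

```python
import requests

API_BASE = "https://validator.api.omega-labs.ai"  # validator API host used in this repo

payload = {
    "query": "minecraft gameplay footage",  # example topic
    "num_videos": 1,
    "video_metadata": [],  # fill with your ImageBind-embedded clip metadata
}
# The route is registered as a GET that reads the Videos payload from the body.
response = requests.get(API_BASE + "/api/check_score", json=payload, timeout=90)
print(response.json())
```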
18262
18263#### Common Troubleshooting Tips
182641. If you've been running for several minutes and have not received any requests, make sure your port is open to receiving requests. You can try hitting your IP and port with `curl`. If you get no response, it means your port is not open.
182652. You can use our [validator logs W&B](https://wandb.ai/omega-labs/omega-sn24-validator-logs) to see how your miner is scoring in practice.
18266
18267### Running a Validator
18268#### Requirements
18269- Python 3.8+
18270- Pip
18271- GPU with at least 24 GB of VRAM
18272- If running on runpod, `runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04` is a good base template.
18273
18274#### Recommended
18275 - Setting up wandb. Set environment variable with `export WANDB_API_KEY=<your API key>`. Alternatively, you can disable W&B with `--wandb.off`.
18276
18277#### Setup
182781. To start, clone the repository and `cd` to it:
18279```bash
18280git clone https://github.com/omegalabsinc/omegalabs-bittensor-subnet.git
18281cd omegalabs-bittensor-subnet
18282```
182832. Install ffmpeg. If you used the runpod image recommended above, ffmpeg is already installed. Otherwise, if you're on Ubuntu, just run: `apt-get -y update && apt-get install -y ffmpeg`.
182843. Install pm2 if you don't already have it: [pm2.io](https://pm2.io/docs/runtime/guide/installation/).
182854. Next, install the `omega` package: `pip install -e .`
18286
18287#### Run auto-updating validator with PM2 (recommended)
18288```bash
18289pm2 start auto_updating_validator.sh --name omega-validator -- \
18290 --netuid {netuid} \
18291 --wallet.name {wallet} \
18292 --wallet.hotkey {hotkey} \
18293 --axon.port {port} \
18294 --logging.trace
18295```
18296 Note: you might need to adjust "python" to "python3" within `auto_updating_validator.sh` depending on your preferred system python.
18297
18298#### Run basic validator with PM2
18299```bash
18300pm2 start neurons/validator.py --name omega-validator -- \
18301 --netuid {netuid} \
18302 --wallet.name {wallet} \
18303 --wallet.hotkey {hotkey} \
18304 --axon.port {port} \
18305 --logging.trace
18306```
18307
## Contributing

We believe in the power of community and collaboration. Join us in building the world's largest decentralized multimodal dataset for AGI research! Whether you're a researcher, developer, or data enthusiast, there are many ways to contribute:

- Submit high-quality videos and annotations
- Develop and improve data validation and quality control mechanisms
- Train and fine-tune models on the dataset
- Create applications and tools that leverage the dataset
- Provide feedback and suggestions for improvement

To get started, please see our [contribution guidelines](./CONTRIBUTING.md) and join our vibrant community on [Discord](https://discord.gg/opentensor).

## License

The OMEGA Labs Bittensor subnet is released under the [MIT License](./LICENSE).

---

🌟 Together, let's revolutionize AGI research and unlock the full potential of multimodal understanding! 🌟
</div>

---
File: /setup.py
---

# The MIT License (MIT)
# Copyright © 2023 Yuma Rao
# Copyright © 2023 Omega Labs, Inc.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import re
import os
import codecs
from os import path
from setuptools import setup, find_packages


def read_requirements(requirements_path):
    """Read one requirement per line from the given file."""
    with open(requirements_path, "r") as f:
        requirements = f.read().splitlines()
    return requirements


requirements = read_requirements("requirements.txt")
here = path.abspath(path.dirname(__file__))

with open(path.join(here, "README.md"), encoding="utf-8") as f:
    long_description = f.read()

# Load the version string from omega/__init__.py
with codecs.open(
    os.path.join(here, "omega/__init__.py"), encoding="utf-8"
) as init_file:
    version_match = re.search(
        r"^__version__ = ['\"]([^'\"]*)['\"]", init_file.read(), re.M
    )
    if version_match is None:
        raise RuntimeError("Unable to find __version__ in omega/__init__.py")
    version_string = version_match.group(1)

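# For reference, the regex above matches a line of the form
#     __version__ = "1.2.3"
# (with single or double quotes) and captures the version in group 1;
# re.M makes ^ anchor at the start of each line rather than only at the
# start of the file.
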
setup(
    name="omega_bittensor_subnet",
    version=version_string,
    description="omega_bittensor_subnet",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/omegalabsinc/omega-bittensor-subnet",
    author="Omega Labs, Inc.",
    packages=find_packages(),
    include_package_data=True,
    author_email="[email protected]",
    license="MIT",
    python_requires=">=3.8",
    install_requires=requirements,
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "Topic :: Software Development :: Build Tools",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Topic :: Scientific/Engineering",
        "Topic :: Scientific/Engineering :: Mathematics",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
        "Topic :: Software Development",
        "Topic :: Software Development :: Libraries",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
)

---
File: /test_audio_dataset.py
---

import os
from io import BytesIO

import numpy as np
import pandas as pd
import soundfile as sf
from datasets import load_dataset

# Set the HF_TOKEN environment variable (or pass a token directly).
# Alternatively, log in once with huggingface_hub.login(token=HF_TOKEN).
HF_TOKEN = os.getenv('HF_TOKEN')

# Load the diarization dataset from the Hugging Face Hub
dataset = load_dataset("tezuesh/diarization_dataset", token=HF_TOKEN)

print(f"Dataset loaded successfully with {len(dataset['train'])} training examples")

# Inspect the first row of the dataset
first_row = dataset['train'][0]
print("\nKeys in first row:")
print(list(first_row.keys()))

print("\nValues in first row (lengths shown for list fields):")
for key in first_row.keys():
    if isinstance(first_row[key], list):
        print(f"{key}: {len(first_row[key])}")
    else:
        print(f"{key}: {first_row[key]}")

# Decode the raw audio bytes into a sample array and sample rate
audio_bytes = first_row['audio_bytes']
audio_arr, sr = sf.read(BytesIO(audio_bytes))
audio = np.array(audio_arr)
print(audio.shape)

# Save the full audio track, named by its YouTube ID
youtube_id = first_row['youtube_id']
os.makedirs('Dataset_audios/Original', exist_ok=True)
sf.write(f'Dataset_audios/Original/{youtube_id}.wav', audio, sr)

# Per-segment diarization annotations: start/end times and speaker labels
diar_timestamps_start = first_row['diar_timestamps_start']
diar_timestamps_end = first_row['diar_timestamps_end']
diar_speakers = first_row['diar_speakers']

# Create the per-video clips directory once, up front
os.makedirs(f'Dataset_audios/Clips/{youtube_id}', exist_ok=True)

for start, end, speaker in zip(diar_timestamps_start, diar_timestamps_end, diar_speakers):
    # Convert segment boundaries from seconds to sample indices
    start_sample = int(start * sr)
    end_sample = int(end * sr)

    # Extract the clip
    clip = audio[start_sample:end_sample]

    # Save the clip with speaker and timestamp info in the filename
    clip_filename = f'Dataset_audios/Clips/{youtube_id}/speaker_{speaker}_{start:.2f}-{end:.2f}.wav'
    sf.write(clip_filename, clip, sr)

# Collect the diarization segments as records
diarization_data = []
for start, end, speaker in zip(diar_timestamps_start, diar_timestamps_end, diar_speakers):
    diarization_data.append({
        'youtube_id': youtube_id,
        'start_time': start,
        'end_time': end,
        'speaker': speaker,
        'duration': end - start
    })

# Convert to a pandas DataFrame and save as CSV
df = pd.DataFrame(diarization_data)
os.makedirs('Dataset_audios/Metadata', exist_ok=True)
df.to_csv(f'Dataset_audios/Metadata/{youtube_id}_diarization.csv', index=False)

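# The resulting CSV contains one row per diarization segment, with the
# columns: youtube_id, start_time, end_time, speaker, duration.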