Nous Hermes 13B: Reddit discussion roundup
Is Nous-Hermes-2-SOLAR-10.7B uncensored?
/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and exclude blind users from the site.

At the 70b level, Airoboros blows both versions of the new Nous models out of the water.

I've got a feeling I wouldn't notice the censorship, so it's worth checking this one out, I suppose.

LLaMA2-13B-Tiefighter and MythoMax-L2-13b for when you need some VRAM for other stuff.

Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions.

I just tried Nous Hermes 13b a bit and I noticed it gets incoherent faster after 2k tokens. It replaced my previous favorites.

Apart from the RAM usage, I didn't find improvements; I tried 7B, 13B, and 30B at q2, so maybe I'm doing something wrong. Especially when loading with Exllama-HF. EDIT: I meant Nous Hermes, not Chronos; these all blend together.

I've made a playground with a bunch of the top 13B models (OpenOrca, Airoboros, Nous-Hermes, Vicuna, etc.).

My last post was almost two weeks ago (I know, it's an eternity in LLM land), and I updated it last week with Nous Hermes 2 - Mixtral 8x7B.

The model card lists the two experts as bagel-dpo-34b-v0.2 and Nous-Hermes-2-Yi-34B.

Interestingly, both Pygmalion 13b and Mythomax 13b can't solve the puzzle by themselves, but a merge between them can. They're both solid 13b models that still perform well and are really fast.

After testing so many models, I think "general intelligence" is a key (or maybe "the" key) to success: the smarter a model is, the less it seems to suffer from the repetition issue.

Custom Dataset Enriched with Function Calling: our model's training data includes a unique feature, function calling.

After going through many benchmarks and my own very informal testing, I've narrowed down my favorite LLaMA models to Vicuna 1.3, WizardLM 1.0 (and its uncensored variants), and Airoboros.
Still trying to find settings I like for MythoMax, but it's been well tuned for uncensored creative storytelling/role play in my experience.

By default, Ollama uses 4-bit quantization. I need something lightweight that can run on my machine, so maybe 3B, 7B or 13B.

Meanwhile, ChatGPT failed at simple tasks that Hermes figured out.

Models like Mixtral have more sensitive distributions, unless you can dial in just the right combo.

The role-playing chats I've been doing with the Nous Hermes Llama2 13B GGML model have been just amazing.

I've been looking into and talking about the Llama 2 repetition issues a lot, and TheBloke/Nous-Hermes-Llama2-GGML (q5_K_M) suffered the least from it.

This distinctive addition transforms Nous-Hermes-2-Vision into a Vision-Language Action Model.

Vicuna, Mixtral 8x7b, Airoboros 70B, Broody's Story Brainstorming LLM.

I loved OpenOrcaxOpenChat-Preview2-13B when it came out, and I thought for a while it would be my "main" model to use, and when it gets it, it gets it better than Nous-Hermes. But I started to notice it has a reverse-literalism problem, where sometimes I ask a very specific question that can only be interpreted one way, and it instead assumes it means something else.

I haven't used Vicuna personally, but I second MythoMax and Nous-Hermes.

openorca-platypus2-13b.q5_K_M

I have tried many; my favorite 13b model is nous-hermes-llama2-13b. But nicely descriptive!

Hermes-LLongMA-2-13B-8K: doesn't seem as eloquent or smart as regular Hermes, did less emoting, got confused, wrote what User does, showed misspellings.

This list is poorly tested (0-1 shots).

I like Nous-Hermes-Llama2-13B, but after a while it starts outputting sentences which lack prepositions. Sometimes even common, short verb conjugations go missing (am, are, etc.).

I've run a few 13b models on an M1 Mac Mini with 16GB of RAM. Nous Hermes 13b is very good.
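A quick back-of-the-envelope for why 4-bit 13B models count as "lightweight" (a rough rule of thumb I'm assuming here, not an official Ollama formula): quantized weight size in GB is roughly parameter count in billions times bits per weight, divided by 8.

```python
# Rough estimate of quantized weight size (ignores KV cache and runtime overhead).
# This is an illustrative rule of thumb; k-quants carry extra per-block metadata.
def approx_weight_gb(n_params_billion: float, bits: float) -> float:
    return n_params_billion * bits / 8

print(approx_weight_gb(13, 4))  # 13B at 4-bit: 6.5 GB of weights
print(approx_weight_gb(7, 4))   # 7B at 4-bit: 3.5 GB
print(approx_weight_gb(3, 4))   # 3B at 4-bit: 1.5 GB
```

That 6.5 GB figure is why a 13B q4 model is plausible on a 12 GB GPU or a 16 GB Mac, while a 30B at the same bit width is not.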
I also want to know how the loser characters would die to the winners. Is Nous-Hermes-2-SOLAR-10.7B uncensored?

But I expected both to do better than that. Mythomax and Nous-Hermes-2-SOLAR sometimes showed perplexing responses, mentioning things that made no sense.

Using these settings there is no OOM on load or during use, and context size reaches up to ~3254, hovering around that value with max_new_tokens set to 800.

...as long as they are high quality and they aren't against Reddit ToS.

Out of all the models I've been trying so far in ST, I've been having the best results with Chronos Hermes 13B.

13B: Fimbulvetr-11B-v2.q4_k_m.

Impressive given 13b vs 7b.

Example: ollama run nous-hermes:13b-q4_0

And many of these are 13B models that should work well with lower-VRAM GPUs! I recommend trying to load with Exllama (HF if possible).

13B: Xwin-MLewd-13B-V0.2.Q4_K_M.

Using a 3060 (12GB VRAM): Nous-Hermes-13B, max_seq_len = 4096.

Subjectively speaking, Mistral-7B-OpenOrca is waaay better than Luna-AI-Llama2-Uncensored. WizardLM-13B-V1.2 has been supported on Apple Silicon Macs with >= 16GB of RAM for a while now.

I'll check a few models again with different ST settings: New Model Comparison/Test (Part 1 of 2: 15 models tested, 13B+34B), winner: Mythalion-13B; New Model RP Comparison/Test (7 models tested), winners: MythoMax-L2-13B and vicuna-13B-v1.5-16K.
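The 3060 settings quoted in these posts, collected in one place. A minimal sketch: the parameter names follow text-generation-webui's ExLlama loader, and the relationship between `max_seq_len` and `compress_pos_emb` (linear RoPE scaling) is the usual rule of thumb for Llama models with a 2048-token native context, not something the posts spell out.

```python
# Settings reported in the thread for Nous-Hermes-13B (GPTQ) on a 12 GB RTX 3060.
NATIVE_CTX = 2048  # Llama-1-era native context length (assumption)

settings = {
    "max_seq_len": 4096,      # extended context window
    "compress_pos_emb": 2,    # linear positional-embedding scaling factor
    "max_new_tokens": 800,    # generation cap mentioned in the post
}

# Rule of thumb: compress_pos_emb = max_seq_len / native context.
assert settings["max_seq_len"] // NATIVE_CTX == settings["compress_pos_emb"]
print(settings["max_seq_len"])
```

With these values the reported ~3254 tokens of context plus 800 new tokens stays just inside the 4096-token window.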
I find the former quite decent, but sometimes I notice that it traps itself in a loop, repeating the same scene over and over, while the latter seems more prone to messing up details.

This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine-tuning process and dataset curation, and Redmond AI sponsoring the compute.

If you don't have the hardware, or want to stick with cloud-based models, you could look at Mytholite on Mancer.

Its quality, diversity and scale are unmatched in the current open-source LM landscape.

Even in those puzzling responses they didn't seem to allow any disrespectful behaviour, but they didn't make sense.

Full offload GGML performance has come a long way.

dolphin, airoboros and nous-hermes have no explicit censorship; airoboros is currently the best 70b Llama 2 model, as other ones are still in training.

nous-hermes-llama2-13b

We achieve incredible benchmarks and surpass all of the previous Open Hermes and Nous Hermes in all scores.

Thanks for all the tips.

Since the beginning of the year, I've built multiple custom nodes for ComfyUI, translated scripts from PowerShell to Python, and started to build a text parser for document and web page analysis.

I installed Nous-Hermes-13B-GGML & WizardLM-30B-GGML using the instructions in this reddit post.

Until the 8K Hermes is released, I think this is the best it gets for an instant, no-fine-tuning chatbot.

Solar hermes is generally the worst mainstream solar finetune I know of.

Chronos-Hermes-13B-v2: more storytelling than chatting, sometimes speech inside actions, not as smart as Nous-Hermes-Llama2, didn't follow instructions that well.
At lower settings, it seems to end up eventually just free-associating a bunch of words, almost at random (but not completely random; they are always somehow connected to each other and to the story).

When I ask Nous Hermes 13b to write a violent sexual scene, it does it without complaining.

I've been using Hermes so far, which seems to be the most coherent.

Big LLM Score Update: TULU, Camel, Minotaur, Nous Hermes, Airoboros. When I tested Nous Hermes, there was a new model on the block called Camel.

I can even add multiple characters to the context and it works, juggling all of them usually quite successfully! It will produce dialogue and actions for each character.

Is Nous-Hermes-2-SOLAR-10.7B uncensored? I want to do some death battle scenarios with some fictional characters without being lectured about morality.

I use Wizard for long, detailed responses and Hermes for unrestricted responses, which I will use for horror(ish) novel research.

I just uploaded the Puffin benchmarks, and I can confirm Puffin beats Hermes-2 for the #1 spot in even popular single-turn benchmarks like Arc-E, Winogrande and Hellaswag, and ties Hermes-2 in PIQA.

We are now offering you the opportunity to test the Nous-Hermes-Llama2-13b model, which has been finely tuned to elevate your role-playing experience.

I have been testing out the new generation of models (airoboros 70B, nous hermes llama2, chronos hermes). So far, the models I've tried are reluctant to use explicit language, no matter what characters I use them with. Go figure.

I don't use these models enough to tell if the general quality changed.

I double-checked to make sure my context/instruct settings were right, textgen settings too, and yet despite everything being OK I could barely get a few posts into the roleplay before things began to nosedive into uselessness.
The main models I use are wizardlm-13b-v1.

I find the 13b parameter models to be noticeably better than the 7b models, although they run a bit slower on my computer (i7-8750H and 6 GB GTX 1060).

70B: Xwin-LM-70B-V0.1, Synthia-70B-v1.2b.

The main limitation on being able to run a model on a GPU seems to be its memory.

A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine-tuning process.

VRAM usage sits around 11.8 GB with other apps open, such as Steam and 20 or so Chrome tabs with a Twitch stream in the background.

Every single model I load has an out-of-memory error; I've tried 4-bit quant 30b/33b models and 13b models.

List of failed models with varying amounts of errors; almost all started writing for the user at some point, probably because of non-optimal ST settings.

Further fine-tuning Nous-Hermes-llama-2-7b?

OrcaMini is Llama1; I'd stick with Llama2 models.

But it's really valuable to see the outputs side by side.

Would like to see a Nous Hermes 2 Miqu!

serpdotai/sparsetral-16x7B-v2 HF, 4K context, ChatML format: gave correct answers to only 3+3+4+5=15/18 multiple-choice questions! Just the questions, no previous information: 1+1+0+5=7/18.

I don't know who is experiencing repetition issues or not, since there hasn't been a post for 26 days. Nous-Hermes-Llama-2 13B GGUF, with repetition still seeming somewhat inevitable.

Currently on Nous-Hermes-Kimiko-13B and it's working pretty well. I'll report back with my impressions once I've tested this.

Everything Hermes failed, ChatGPT failed just as much.
But sometimes I had problems making those creative models (Nous-Hermes, Chronos, Airoboros) follow instructions.

My top three are (note: my rig can only run 13B/7B): wizardLM-13B-1.0, Nous-Hermes-13B, and Selfee-13B-GPTQ (this one is interesting; it will revise its own response. Though most of the time, the first response is good enough).

They aren't explicitly trained on NSFW content, so if you want that, it needs to be in the prompt.

This is a follow-up to my previous posts here: New Model RP Comparison/Test (7 models tested) and Big Model Comparison/Test (13 models tested). Originally planned as a single test of 20+ models, I'm splitting it up in two segments to keep the post manageable in size: first the smaller models (13B + 34B), then the bigger ones (70B + 180B).

Token issue with Nous-Hermes-Llama2-13b: I'm using this model for privateGPT, but it keeps saying there's a 512-token limit with the model; yet if I look at its Hugging Face repo, it says 4096. What can I do about this?

Nous-Hermes & Puffin (13b) having opposite opinions: I was testing some models with random questions to see differences, and I've found a curious one. When you ask how you should defrost a frozen meal (in a glass container), they each prefer a different approach.

My favorite so far is Nous Hermes LLama 2 13B*.

The Hermes 2 model was trained on 900,000 instructions and surpasses all previous versions of Hermes 13B and below, and matches 70B on some benchmarks! Hermes 2 changes the game with strong multi-turn chat skills and system prompt capabilities, and uses ChatML format.

They were specifically for the airoboros-l2-13B-m2.0-GPTQ model, but they're based on similar settings that improved Nous Hermes 13B for me too; good luck.

I'm basically using the completion API, but with the fastchat and alpaca instruction formats, kind of.

It's a mix of Nous-Hermes (very good) + Chronos (to make it more creative, in theory).

I even tried forcing outputs to start a certain way, but it's still too "clean" to have any fun with.

I also personally like Chronos-Hermes 13B.

(q4 means 4-bit quantization.)
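Since the Hermes 2 post above says the model uses ChatML, here is a minimal sketch of what a ChatML prompt looks like when built by hand. The `<|im_start|>`/`<|im_end|>` framing is the standard ChatML convention; the system/user text is just an illustrative placeholder.

```python
# Minimal ChatML prompt builder (format per the ChatML convention Hermes 2 uses).
def chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # left open so the model completes the turn
    )

prompt = chatml_prompt("You are a helpful assistant.", "Summarize this thread.")
print(prompt.count("<|im_start|>"))  # three turns opened: system, user, assistant
```

Using a model's documented prompt format (ChatML here, Alpaca-style for the older Nous-Hermes releases) is often the difference between coherent multi-turn chat and the rambling behavior people complain about.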
12GB is sufficient for 13B full offload using the current version of koboldcpp, as well as using exllama.

It maybe helps its prose a little, but it gives the base model a serious downgrade in IQ.

I'm running ooba Text Gen UI as the backend for the Nous-Hermes-13b 4-bit GPTQ version, with the new exllama and exllama-hf; it's real fast on my local 3060.

I've run my usual tests and updated my rankings with a diverse mix of 6 new models from 1.6B to 120B: StableLM 2 Zephyr 1.6B, DiscoLM German 7B, Mixtral 2x7B, Beyonder, Laserxtral, and MegaDolphin 120B.

mythomax-l2-13b

The replies aren't as long as Poe's, but they're well written, in character, and with little to no repetition.

I would start with Nous-Hermes-13B for uncensored, and wizard-vicuna-13B or wizardLM-13B-1.0 for censored general instruction-following.
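A rough sanity check for the "12 GB is sufficient for 13B full offload" claim. This is an illustrative estimate, not a precise accounting: q4 13B weights come to roughly 7 GB, and the overhead figure covering KV cache, CUDA context, and desktop apps is my assumption.

```python
# Will a quantized model fully offload to a given GPU? (back-of-the-envelope)
def fits_in_vram(vram_gb: float, weights_gb: float, overhead_gb: float = 2.5) -> bool:
    # overhead_gb is a guessed allowance for KV cache, CUDA context,
    # and background desktop VRAM usage (browsers, streams, etc.)
    return weights_gb + overhead_gb <= vram_gb

print(fits_in_vram(12, 7.0))  # 13B q4 on a 12 GB card: True
print(fits_in_vram(8, 7.0))   # the same model on an 8 GB card: False
```

This also matches the note elsewhere in the thread that 12 GB can be "very tight on Windows" once background VRAM usage eats into the headroom.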
🏆 This could be the leading open-source Large Language Model (LLM) with its superior quality blends.

I've been using it to help with writer's block as well as a starting point for writing blog posts.

I wouldn't consider current Puffin to be a successor to Hermes per se, but rather a side-grade: a branch of a different style that some people might like over Hermes depending on their preference and use case, and vice versa.

This is version 2 of Nous Research's line of Hermes models, and Nous Hermes 2 builds on the Open Hermes 2.5 dataset, surpassing all Open Hermes and Nous Hermes models of the past, trained over Yi 34B with others to come!

Honorable mention: EstopianMaid is another good 13b model, while Fimbulvetr is a good 10.7B.

Thanks for training/sharing this @NousResearch.

But it takes a longer time to arrive at a final response.

I'm afraid none of them will get you verbatim facts without some risk of hallucination, but in general, the larger the model / the less heavily quantized, the higher the "resolution".

This model (13B version) works better for me than Nous-Hermes-Llama2-GPTQ; it can handle the long prompts of a complex card (mongirl, 2851 tokens with all example chats) in 4 out of 5 tries.

Yes, I've tried Samantha the editor, and my results with it were very poor compared to whatever else I've tried.

Even when my character card is totally OK with something like that.

It tops most of the 13b models in most benchmarks I've seen it in.

New unfiltered 13B: OpenAccess AI Collective's Wizard Mega 13B.

Just having a greeting message isn't enough to get it to copy the style; ideally your character card should include examples, and your own first message should also look like what you want to get back.
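The character-card advice above can be made concrete. A minimal sketch, with assumptions: the field names (`first_mes`, `mes_example`) and the `<START>` / `{{user}}` / `{{char}}` placeholders loosely follow SillyTavern-style card conventions, and the character itself is invented for illustration.

```python
# Illustrative character card: the example dialogue and the greeting are written
# in the same style (actions in asterisks, quoted speech) you want the model to copy.
card = {
    "name": "Mira",
    "first_mes": '*Mira looks up from her book.* "Oh! I didn\'t hear you come in."',
    "mes_example": (
        "<START>\n"
        "{{user}}: *waves* Hello there.\n"
        '{{char}}: *She smiles warmly and sets the book aside.* "Welcome, traveler."\n'
    ),
}

# The greeting and the examples should share formatting conventions.
print("*" in card["first_mes"] and "*" in card["mes_example"])
```

The point of the post stands regardless of the exact field names: models imitate whatever style the card and your own first message demonstrate, so examples matter more than instructions.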
Nous-Hermes-Llama2-70B. 13B: Mythalion-13B.

But MXLewd-L2-20B is fascinating me a lot, despite the technical issues I'm having with it.

Edit: these 7B and 13B models can run on Colab using a GPU at much faster speeds than 2 tokens/s.

I've not had much luck getting Mistral to behave, though.

Can't speak for Chronos Hermes, but I love base Nous Hermes; I think it's better than Wizard Vicuna by a bit.

Before I got into open-source-ish models (since Llama-2 has restrictions and LLaMA even worse), Bard had a bad problem with repetition. I believe it also did relatively well otherwise, but the Llama-2 chat has some as well.

Big Model Comparison/Test (13 models tested), winner: Nous-Hermes-Llama2. SillyTavern's Roleplay preset vs. model-specific prompt format.

compress_pos_emb = 2

📅 Developed using over 1 million examples from GPT-4 and various open-source data collections.
I've noticed that MythoMax-L2-13B needs more guidance to use actions/emotes than, e.g., Nous Hermes.

It doesn't get talked about very much in this subreddit, so I wanted to bring some more attention to Nous Hermes.

Nothing works.

Different models require slightly different prompts, like replacing "narrate" with "rewrite".

Unfortunately, while this model does write quite well, it still only takes me about 20 or so messages before it starts showing the same "catch phrase" behavior as the dozen or so other LLaMA 2 models I've tried.

I have found that the Nous-Hermes-Llama2-13b model is very good for NSFW, provided I set the "temperature" setting as high as possible.

I'm curious how the latest Nous Hermes 2 Mistral compares to Mistral 7B v0.2.

My usual prompt goes like this: <Description of what I want to happen>. Narrate this using active narration and descriptive visuals.

Mancer.tech (free 13b model, 2.5k context though), whatever happens to be available on Kobold Horde at any given moment (bit of a dice roll on that one, but you can often get lucky), or see if you can convince Google Gemini to do what you want via Makersuite.

I just released an update to my macOS app to replace the 7B Llama2-Uncensored base model with Mistral-7B-OpenOrca. WizardLM-13B-V1.2's text generation still seems better.

NousResearch has recently unveiled the Nous-Hermes-2-Mixtral-8x7B.

🥇 It's the premier refined version of Mixtral 8x7B, surpassing the original Mixtral Instruct.

Join us as we delve into the intricacies of Hermes 13B, exploring its technical specifications, training data insights, practical applications and API setup.

I tried various loaders like exllama and the others in the dropdown whose names I recognized.

Thanks to our most esteemed model trainer, Mr TheBloke, we now have versions of Manticore, Nous Hermes (!!), WizardLM and so on, all with the SuperHOT 8k context LoRA.

Nous Hermes Llama 2 13B (GGML q4_0), 16GB, docker compose -f docker

I occasionally use Nous-Hermes-13b or Manticore-13b-chat-pyg.
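The "narrate this" prompt pattern described above can be sketched as a small template function. Assumptions: the swappable verb mirrors the "narrate" vs. "rewrite" advice from the post, and the Alpaca-style `### Instruction:` / `### Response:` wrapper is the format the Nous-Hermes-Llama2 model card documents; verify it for whichever model you actually load.

```python
# Hypothetical helper reproducing the post's prompt pattern.
def narration_prompt(description: str, verb: str = "Narrate") -> str:
    instruction = (
        f"{description}. {verb} this using active narration and descriptive visuals."
    )
    # Alpaca-style wrapper (per the Nous-Hermes-Llama2 model card).
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

p = narration_prompt("The two rivals finally meet on the bridge at dawn")
print("### Response:" in p)
```

Swapping `verb="Rewrite"` gives the variant some models respond to better, without changing anything else about the prompt.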
Nous-hermes-70b, wizard uncensored 1.0 13b, wizard uncensored llama 2 13b, Nous-hermes llama 1 13b (slept-on abilities with the right prompts), Wizardcoder-guanaco-15b, upstage/llama-2048 instruct (strongest llama 1 model; except for coding, it is close to many 70b models).

So I'm basically wondering: are there any 13B models that are really good at this, such as chat uncensored, orca, or nous hermes, or are they severely lacking next to their 70B counterparts, to a degree where it might make more sense to use an API or website to access something larger for this more occasional use?

I'll do a comparison of Hermes-LLongMA-2-13B-8K with either scaling method.

The 7b Capybara was solid AF.

Personally, I've been enjoying OpenOrca a lot.

My entire list is at: Local LLM Comparison Repo.

Greetings everyone, we have some great news for all our role-playing enthusiasts.

Chat with Hermes 13B: I expected better roleplaying from Nous, given how good their models are at the 13b level for that. I guess these new models are still fresh behind the ears.

It's quick, usually only a few seconds to begin generating a response.

But if I ask the same of Nous Hermes 13b SuperHOT 8k, it gives me "ethical" advice or just refuses to do it.

Includes a llama.cpp based Space! I'm finding it to be as good or better than Vicuna / Wizard Vicuna / Wizard-uncensored models in almost every case.

It sort of managed to solve my logic puzzle that stumps other LLMs (even GPT-4).

For the 34b, I suggest you choose Exllama 2 quants; for 20b and 13b you can use other formats and they should still fit in the 24GB of VRAM.

I have your same setup (64+12), but I'd rather stay with 13B, using the VRAM as much as possible. If you want to upgrade, the best thing to do would be a VRAM upgrade, so something like a 3090.

I've searched on here for info, but I can't figure it out.
You could also try some of the 2x7b merges, such as Blue-Orchid-2x7b or DareBeagel-2x7b.

I've tested Mythalion 13b; it seems like a good replacement for Nous Hermes 2 13b (my normal go-to model).

Yes, I'm part of NousResearch; the person I'm responding to has already tried Hermes-2, so I'm encouraging them to now try Puffin.

Let's uncover the answers to these questions and more.

Looks like the DPO version is better than the SFT. It also reaches within 0.1% of Hermes.

How Nous-Hermes-13B AI Model Can Help You Generate High-Quality Content and Code (socialviews81.blogspot.com)

Know a guy who tried a bunch of different 7 and 13b models, and Chronos Hermes was reliably the best at carrying a plot, responding to various scenarios, and doesn't have any censorship.

Nous Hermes L2 13B (4-bit) has me really surprised; I've been using it for many days and it's now my clear favorite. q5_K_M. Thank you!

I've tried with Hermes, Mixtral and even Miqu, testing for accuracy of Wikipedia "facts".

The 13B is able to more deeply understand your 24KB+ (8K tokens) prompt file of corpus/FAQ/whatever compared to the 7B model 8K release, and it is phenomenal at answering questions on the material you provide it.

I tried the q5_K_M version of Nous Hermes 13b because I was curious if the lower perplexity would make a difference.

I just tried doing a scene using nous-hermes-2-solar-10.7b and found that it quickly devolved into the bot endlessly repeating itself, regardless of settings.

Having a 20B that's faster than the 70Bs and better than the 13Bs would be very welcome.

Note this can be very tight on Windows due to background VRAM usage.

That's unusual.
Like every single model I used (except for Nous Hermes without an instruction prompt), it understands I want to sign the mail as "Quentin" or similar.

(We need more benchmarks between the three!)

The 4K Nous-Hermes-Llama2 is my current favorite Llama 2 model, but the 8K just didn't work as well for me, so hopefully NTK-aware scaling can bring it on par with the original.

I'm mostly looking for ones that can write good dialogue and descriptions for fictional stories.

What are the differences and optimizations Nous is doing on top of the base model? I'm also curious if there are any rankings/evals for writing style / creative writing. I always have to go through a bunch of random posts trying to figure out what people are using.

At least that's what I'd recommend: Nous-Hermes-Llama-2-13b, Puffin 13b, Airoboros 13b, Guanaco 13b, Llama-Uncensored-chat 13b, AlpacaCielo 13b. There are also many others.

Of the 7Bs, OpenHermes often gave the most "calm" responses.
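For context on the NTK-aware scaling hoped for above: unlike linear scaling (which compresses positions via `compress_pos_emb`), NTK-aware scaling raises the RoPE base frequency. The formula below follows the original NTK-aware scaling proposal; the head dimension of 128 is the usual Llama value, stated here as an assumption.

```python
# NTK-aware RoPE scaling: base' = base * alpha ** (dim / (dim - 2)),
# where dim is the per-head embedding dimension (128 for Llama-family models).
def ntk_scaled_base(base: float = 10000.0, alpha: float = 2.0, dim: int = 128) -> float:
    return base * alpha ** (dim / (dim - 2))

print(ntk_scaled_base(alpha=1))  # alpha = 1 leaves the base unchanged: 10000.0
print(ntk_scaled_base(alpha=2) > 10000)  # larger alpha -> larger base: True
```

The appeal over linear interpolation is that low-frequency components (long-range information) are stretched while high-frequency ones (local detail) stay nearly intact, which is why people hoped it would rescue the 8K variants that got "dumber" under pure position compression.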