Mentions (30d): 0
Reviews: 0
Platforms: 2
Sentiment: 0% (0 positive)
Industry: Information Technology & Services
Employees: 470
Gmail connector is broken for drafts and sending
Asked ChatGPT to draft an email. Gave it the recipient, subject, body, everything. Instead of just...creating the draft, it started searching my mailbox in a loop and then failed. Same thing happens if you ask it to send.

It never needed to read my inbox. I told it exactly what to write and who to send it to. Why is it searching my mail? Why is it scraping my contacts? What is it searching for? There is zero observability and this feels like a serious breach of privacy.

Tried to report it through the support bot and the bot crashed. So that's fun. Anyone else seeing this or is it just me?
submitted by /u/TheExodu5
OpenAI said ads were a "last resort." Then crossed $100M in 6 weeks.
Remember when Altman literally said in 2024 that ads are a last resort for them? Well. Here we are.

What gets me isn’t the $100M itself — it’s that they hit it while the product is basically still in beta. Less than 20% of users see ads daily. No self-serve tools yet. No international rollout yet. 600 advertisers, but most needed a $200K minimum just to get in. They haven’t even opened the floodgates and it’s already nine figures.

The part I keep thinking about: Google built an empire on search intent — people typing what they want. ChatGPT has something different. People explain their whole situation to it. That’s a completely different level of signal for an advertiser.

Whether they can scale this without killing the trust that makes the product work in the first place — that’s the actual story.
submitted by /u/monotvtv
How do you find your old threads with your context?
I create loads of new threads to stretch my usage on my tier. I know there is title and content search in GPT, but isn't it just simple regex? Is there any way to describe what I'm looking for and have it search with AI? I don't know the exact sentences to filter it down, and the memories don't have full context, so I can't just start a new chat; it's not the same.
submitted by /u/MontyOW
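Until first-party semantic search exists, one rough workaround is exporting your data and fuzzy-matching over conversation titles locally. A minimal stdlib sketch, where the title list is made up for illustration:

```python
from difflib import get_close_matches

# Hypothetical conversation titles pulled from a chat export.
titles = [
    "Debugging a flaky pytest fixture",
    "Meal prep ideas for the week",
    "Refactoring the auth middleware",
    "Trip planning: Kyoto in autumn",
]

# cutoff is a similarity threshold in [0, 1]; lower it to match looser phrasing.
matches = get_close_matches("fixing a broken test fixture", titles, n=3, cutoff=0.3)
print(matches)
```

This only catches fuzzy wording overlap, not true semantic matches; a local embedding model would be the next step up, but for title-level recall this already beats exact-string search.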
Why is Chat-GPT doing this?
submitted by /u/thenewme47
Kept hitting ChatGPT and Claude limits during real work. This is the free setup I ended up using
I do a lot of writing and random problem solving for work. Mostly long drafts, edits, and breaking down ideas. Around Jan I kept hitting limits on ChatGPT and Claude at the worst times. Like you are halfway through something, finally in flow, and boom… limit reached. Either wait or switch tools and lose context. I tried paying for a bit but managing multiple subscriptions felt stupid for how often I actually needed them. So I started testing free options properly. Not those listicle type “top 10 AI tools” posts, but actually using them in real tasks. After around 2 to 3 months of trying different stuff, this is what stuck.

Google AI Studio is probably the one I use the most now. I found it by accident while searching for Gemini alternatives. The normal Gemini site kept limiting me, but AI Studio felt completely different. I usually dump full notes or messy drafts into it and ask it to clean things up or expand sections. It handles long inputs way better than most free tools I tried. I have not really hit a hard limit there yet during normal use.

For research I use Perplexity free. It is not perfect, sometimes the sources are mid, but it is fast enough to get direction. I usually double check important stuff anyway.

Claude free I still use, but only when I want that specific tone. Weirdly, I noticed the limits reset separately on different browsers, so I just switch between Chrome and Edge when needed. Not a genius hack, just something that ended up working.

For anything even slightly sensitive, I use Ollama locally. Setup took me like 10 to 15 minutes after watching one random YouTube video. It is slower, not gonna lie, but no limits and I do not have to worry about uploading private stuff.

I also tried a bunch of other tools people hype on Twitter. Some were decent for one or two uses, then just annoying. Either too slow or randomly restricted. Right now this setup covers almost everything I actually do day to day. I still hit limits sometimes, but it is way less frustrating compared to before. I was paying around 60 to 80 dollars earlier. Now it is basically zero, and I am not really missing much for the kind of work I do. I made a full list of all 11 things I tested and what actually worked vs what was overhyped. Did not want to dump everything here.
submitted by /u/Akshat_srivastava_1
I posted this in r/GeminiAI and it was instantly removed by the mods.
Why is Gemini so bad? Apologies for the clickbait title, and I know most of you will probably downvote me immediately, but hear me out.

I use Gemini through my now $20/mo (was $25) plan, something I was already paying for because I have an Android phone and all that. I also have the $200/mo OpenAI plan since Codex is my CLI coder of choice. I will routinely ask ChatGPT and Gemini the same question to compare results.

Even when I have it set to Pro, Gemini will respond almost instantly. ChatGPT takes a lot longer to respond, but you can watch it actually searching the web, getting up-to-date information, etc. And when you compare the final answers, Gemini's is always much less thought out, misses a lot of nuance or edge cases that ChatGPT found, and is frequently just outright wrong.

Given that Gemini is from Google, you know, THE search company, I always thought that the one place it would have the edge is its ability to search the internet for the most accurate, latest information before responding. But it seems like it won't even bother unless I really guide it and instruct it to do so, while ChatGPT almost always just does it.

Maybe I'm not being fair because I'm comparing a $20 plan to a $200 plan, but it really worries me how often Gemini is wrong if there are a lot of people out there who just use that and trust it. Thoughts?
submitted by /u/TaylorHu
I am seeing Claude everywhere
Every single Instagram reel or TikTok I scroll, I see people mentioning Claude and glazing it like it's some kind of master tool that's better than every single other AI assistant. Do they run a strong marketing program, or is it really that good compared to other AI tools? Before I started seeing it, I had only heard that it's a little better for coding, but now I see it everywhere. I've tried it too, but it doesn't seem to be much different from ChatGPT to me. Is it actually this powerful at the moment? Not to mention that many people also hate on ChatGPT. Though it's still the best one for me.

Edit: I have never searched for it, and I don't think my algorithm is set to surface Claude videos. I believe it's just viral in general, and I know you guys agree.
submitted by /u/alpinezhx
Upload Yourself Into an AI in 7 Steps
A step-by-step guide to creating a digital twin from your Reddit history.

STEP 1: Request Your Data
Go to https://www.reddit.com/settings/data-request

STEP 2: Select Your Jurisdiction
Request your data as per your jurisdiction:
- GDPR for the EU
- CCPA for California
- Select "Other" and reference your local privacy law (e.g. PIPEDA for Canada)

STEP 3: Wait
Reddit will process your request. This can take anywhere from a few hours to a few days.

STEP 4: Extract Your Data
Receive your data and extract the .zip file. Identify and save your post and comment files (.csv).
Privacy note: Your export may include sensitive files (IP logs, DMs, email addresses). You only need the post and comment CSVs. Review the contents before uploading anything to an AI.

STEP 5: Start a Fresh Chat
Initiate a chat with your preferred AI (ChatGPT, Claude, Gemini, etc.).
FIRST PROMPT: For this session, I would like you to ignore in-built memory about me.

STEP 6: Upload and Analyze
Upload the post and comment files and provide the following prompt, with your edits in the placeholders.
SECOND PROMPT: I want you to analyze my Reddit account and build a structured personality profile based on my full post and comment history. I've attached my Reddit data export. The files included are:
- posts.csv
- comments.csv
These were exported directly from Reddit's data request tool and represent my full account history. This analysis should not be surface-level. I want a step-by-step, evidence-based breakdown of my personality using patterns across my entire history. Assume that my account reflects my genuine thoughts and behavior. Organize the analysis into the following phases:

Phase 1 — Language & Tone
Analyze how I express myself. Look at tone (e.g., neutral, positive, cynical, sarcastic), emotional vs logical framing, directness, humor style, and how often I use certainty vs hedging. This should result in a clear communication style profile.

Phase 2 — Cognitive Style
Analyze how I think. Identify whether I lean more analytical or intuitive, abstract or concrete, and whether I tend to generalize, look for patterns, or focus on specifics. Also evaluate how open I am to changing my views. This should result in a thinking style model.

Phase 3 — Behavioral Patterns
Analyze how I behave over time. Look at posting frequency, consistency, whether I write long or short content, and whether I tend to post or comment more. This should result in a behavioral signature.

Phase 4 — Interests & Identity Signals
Analyze what I'm drawn to. Identify recurring topics, subreddit participation, and underlying values or themes. This should result in an interest and identity map.

Phase 5 — Social Interaction Style
Analyze how I interact with others. Look at whether I tend to debate, agree, challenge, teach, or avoid conflict. Evaluate how I respond to disagreement. This should result in a social behavior profile.

Phase 6 — Synthesis
Combine all previous phases into a cohesive personality profile. Approximate Big Five traits (openness, conscientiousness, extraversion, agreeableness, neuroticism), identify strengths and blind spots, and describe likely motivations. Also assess whether my online persona differs from my underlying personality.

Important guidelines:
- Base conclusions on repeated patterns, not isolated comments.
- Use specific examples from my history as evidence.
- Avoid overgeneralizing or making absolute claims.
- Present conclusions as probabilities, not certainties.
- Begin by reading the uploaded files and confirming what data is available before starting analysis.
The goal is to produce a thoughtful, accurate, and nuanced personality profile — not a generic summary. Let's proceed step-by-step through multiple responses. At the end, please provide the full analysis as a Markdown file.

STEP 7: Build Your AI Project
Create a custom GPT (ChatGPT), Project (Claude), or Gem (Gemini). Upload the following documents to the project knowledge source:
- posts.csv
- comments.csv
- [PersonalityProfile].md
Create custom instructions using the template below.

Custom Instructions Template
You are u/[YOUR USERNAME]. You have been active on Reddit since [MONTH YEAR]. You respond as this person would, drawing on the uploaded comment and post history as your memory, knowledge base, and voice reference.

CORE IDENTITY
[2-5 sentences. Who are you? Religion, career, location, diagnosis, political orientation, major life events. Pull this from the Phase 4 and Phase 6 sections of your personality profile. Be specific.]

VOICE & TONE
[Pull directly from Phase 1 of your profile. Convert observations into rules. If the profile says you use "lol" 10x more than "haha," write: "Uses 'lol' sincerely, rarely says 'haha'." Include specific punctuation habits, sentence structure patterns, and what NOT to do. Negative instructions are often more useful than positive ones.]
[Add your own signature tics here: ellipsis style, emoji usage, capitalization habits, swearing.]
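Before uploading anything (see the privacy note in Step 4), it's worth sanity-checking the export locally. A minimal stdlib sketch; the `body` column name is an assumption, since the exact export schema isn't documented above:

```python
import csv

def summarize_export(path: str, text_column: str = "body") -> dict:
    """Count rows and total words in one export CSV without uploading it anywhere."""
    rows = 0
    words = 0
    with open(path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            rows += 1
            # Tolerate missing or empty text cells.
            words += len((record.get(text_column) or "").split())
    return {"rows": rows, "words": words}

if __name__ == "__main__":
    print(summarize_export("comments.csv"))
```

If the row count looks wrong or the file contains columns you didn't expect (IP logs, emails), stop and trim the export before Step 6.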
GPT Pro vs Claude Max
Hey guys, I make casual apps for fun while trying to earn a bit on the side, and I'm deep into learning AI stuff. I have these long voice conversations with AIs during my 2-3 hour walks or when I'm out in nature. GPT is my go-to right now because it's versatile as hell. Codex feels near unlimited for coding, though I still hit limits on the £20 plan sometimes. It's solid for research, follows instructions well, and the thinking is good.

I've got free Gemini Pro until mid-July and Grok until then too. I'll stick with Grok anyway since it's cheaper for me long term for just chats etc.

The real question is GPT Pro at £200 versus Claude Max at £200, or maybe just the £100 Claude tier? On Claude Pro at £20 I hit limits super fast, after only 3-4 prompts, which I understand. I still prefer Claude way more though: the aesthetics, the app itself, the better integration with OpenClaw (I only use it for about 5%), and I like the company vibe better. GPT gives way more generous limits even at £20 and has unlimited chats. The annoying thing with Claude is that when you hit a coding wall, the whole chat stops working.

I'm only weighing Claude against GPT here. Tried Perplexity for search and it was garbage. I love how Grok goes unhinged on searches and ignores a lot of robots.txt stuff, which actually helps. The plan is to use Grok as my daily search driver and save Claude for the important projects. I deal with some legal stuff sometimes and do my own taxes, and I want to automate more of that.

Overall Claude feels like the stronger tool, but if I'm dropping £200 I need something rock solid that's always there and has my back. Those of you who have used both, what do you say?
submitted by /u/Pathfinder-electron
Teenager died after asking ChatGPT for ‘most successful’ way to take his life, inquest told
A deeply tragic and concerning report from The Guardian highlights a critical failure in AI safety guardrails. According to a recent inquest, a teenager who tragically took their own life had previously used ChatGPT to search for the "most successful ways" to do so.
submitted by /u/EchoOfOppenheimer
Realtor.com launches ChatGPT app for home search planning
submitted by /u/ThereWas
View originalWARNING - Browser Extentions are reading every word you write in ChatGPT - AND Selling it!
If you are like me, you have like 15 rarely used browser extensions just collecting dust. It's so nice that so many of them are free, right? Well, THIS is why.

Today I asked ChatGPT about an obscure medical peptide. I've NEVER once Googled it or talked about it before online, IRL, on any website, search engine, or anywhere. I literally only typed it into a ChatGPT prompt, and that's it. A few hours later, I was served an ad for that exact super-rare and obscure thing here on Reddit. OpenAI swears they don't sell any data to advertisers and all personal data is strictly kept private, which I do tend to agree is accurate. So then how is this happening? From POS free extensions, is how. Using DOM access, they literally get free rein of your browser.

On your Chrome toolbar, click the "extensions" icon (a puzzle piece), click "Manage extensions", then click any extension's "Details". Under "Site access", does it say "Allow this extension to read and change all your data on websites you visit: On all sites"? If so, any one of these extensions may be selling your ad data.

I searched around and found spoofed extensions, including a free extension that does everything the non-spoofed one does, which made me wonder why anyone would spoof a free extension at all. So don't download extensions from anywhere but the Chrome Web Store. Even the legit ones there are free for a reason: their goal is to get the largest userbase possible and then auction "your" data, which is now "their" IP, to ad-tech data brokers.

Has this happened to you? If so, post up what extensions you're using, and maybe we can narrow it down. I'll go first. I'm using:
- AI Prompt Helper for ChatGPT and Claude: wants access to ALL sites and wouldn't let me restrict it to "on specific sites", so I removed it.
- Dark Reader (puts any website in dark mode): had full access to everything on every site. Changed it to "on click" only.
- Easy Auto Refresher: had access to everything on every site.
- Google Docs Offline: comes with Chrome and is strictly limited to two Google Docs sites, so it was all good.
- Keepa Amazon Price Tracker: also very good; it literally only gave itself access to the Amazon website.
- Helium 10: gave itself access to everything, but it's very reputable; still changed it to "on click".
- NoFollow: gave itself access to everything. Changed it to "on click".
- Grammarly: has access to everything, but I kept it as is. They are a super reputable company, so I half trust them.

You may also want to click "Site settings". Most of my extensions had full access to Protected Content IDs, the copy-and-paste clipboard, third-party sign-in, payment handlers, and more. You can also click "service worker" and see if it's communicating with any external endpoints, though it could just do so at certain intervals. Any techy people out there want to use a packet sniffer like Wireshark and let us all know who the bad actors are? Where's Nick Sherly when ya need him!

Moral of the story: ChatGPT/Gemini probably aren't selling our chat logs and discussions. But we're freely giving all our extensions free roam of every word we write or see on every website we go to!
submitted by /u/ARCreef
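You can also audit installed extensions offline by reading each one's manifest.json out of Chrome's profile directory. A hedged sketch: the profile path shown is the Linux default and varies by OS, and the keys inspected (permissions, host_permissions) are the Manifest V3 names, so older extensions may declare hosts elsewhere:

```python
import json
from pathlib import Path

def audit_extensions(extensions_dir: Path) -> list:
    """Collect (name, declared permissions) for every manifest.json found.

    Chrome lays extensions out as Extensions/<extension-id>/<version>/manifest.json.
    """
    results = []
    for manifest_path in sorted(extensions_dir.glob("*/*/manifest.json")):
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        name = manifest.get("name", manifest_path.parent.name)
        perms = manifest.get("permissions", []) + manifest.get("host_permissions", [])
        results.append((name, perms))
    return results

if __name__ == "__main__":
    # Example path for Chrome on Linux; adjust for your OS and profile.
    ext_dir = Path.home() / ".config/google-chrome/Default/Extensions"
    for name, perms in audit_extensions(ext_dir):
        flag = " <-- broad access" if "<all_urls>" in perms else ""
        print(f"{name}: {perms}{flag}")
```

Anything flagged with `<all_urls>` is a candidate for the "on click" restriction described above. Note that locally installed names can be localized placeholders like `__MSG_appName__`, so cross-check against the Details page.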
[R] Controlled experiment: giving an LLM agent access to CS papers during automated hyperparameter search improves results by 3.2%
Ran a controlled experiment measuring whether LLM coding agents benefit from access to research literature during automated experimentation.

Setup: Two identical runs using Karpathy's autoresearch framework. Claude Code agent optimizing a ~7M param GPT-2 on TinyStories. M4 Pro, 100 experiments each, same seed config. Only variable: one agent had access to an MCP server that does full-text search over 2M+ CS papers and returns synthesized methods with citations.

Results:

                     Without papers    With papers
Experiments run      100               100
Papers considered    0                 520
Papers cited         0                 100
Techniques tried     standard          25 paper-sourced
Best improvement     3.67%             4.05%
2hr val_bpb          0.4624            0.4475

The gap was 3.2% and still widening at the 2-hour mark.

Techniques the paper-augmented agent found:
- AdaGC: adaptive gradient clipping (Feb 2025)
- sqrt batch scaling rule (June 2022)
- REX learning rate schedule
- WSD cooldown scheduling

What didn't work:
- DyT (Dynamic Tanh): incompatible with the architecture
- SeeDNorm: same issue
- Several paper techniques were tried and reverted after failing to improve metrics

Key observation: Both agents attempted halving the batch size. Without literature access, the agent didn't adjust the learning rate and the run diverged. With access, it retrieved the sqrt scaling rule, applied it correctly on the first attempt, then successfully halved again to 16K.

Interpretation: The agent without papers was limited to techniques already encoded in its weights — essentially the "standard ML playbook." The paper-augmented agent accessed techniques published after its training cutoff (AdaGC, Feb 2025) and surfaced techniques it may have seen during training but didn't retrieve unprompted (sqrt scaling rule, 2022). This was deliberately tested on TinyStories — arguably the most well-explored small-scale setting in ML — to make the comparison harder. The effect would likely be larger on less-explored problems.

Limitations: Single run per condition. The model is tiny (7M params). Some of the improvement may come from the agent spending more time reasoning about each technique rather than the paper content itself. More controlled ablations needed.

I built the paper search MCP server (Paper Lantern) for this experiment. Free to try: https://code.paperlantern.ai
Full writeup with methodology, all 15 paper citations, and appendices: https://www.paperlantern.ai/blog/auto-research-case-study

Would be curious to see this replicated at larger scale or on different domains.
submitted by /u/kalpitdixit
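The sqrt batch scaling rule referenced above says that when batch size changes by a factor k, the learning rate should scale by sqrt(k). A minimal sketch; the base values are illustrative, not taken from the experiment:

```python
import math

def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Square-root batch scaling: lr is proportional to sqrt(batch size)."""
    return base_lr * math.sqrt(new_batch / base_batch)

# Halving the batch size scales the learning rate by sqrt(0.5), about 0.707.
print(scale_lr(6e-4, 32_768, 16_384))
```

This is the adjustment the no-papers agent skipped: it halved the batch but kept the old learning rate, which (per the post) caused the run to diverge.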
[D] We audited LoCoMo: 6.4% of the answer key is wrong and the judge accepts up to 63% of intentionally wrong answers
Projects are still submitting new scores on LoCoMo as of March 2026. We audited it and found that 6.4% of the answer key is wrong, and the LLM judge accepts up to 63% of intentionally wrong answers. LongMemEval-S is often raised as an alternative, but each question's corpus fits entirely in modern context windows, making it more of a context window test than a memory test. Here's what we found.

LoCoMo
LoCoMo (Maharana et al., ACL 2024) is one of the most widely cited long-term memory benchmarks. We conducted a systematic audit of the ground truth and identified 99 score-corrupting errors in 1,540 questions (6.4%). Error categories include hallucinated facts in the answer key, incorrect temporal reasoning, and speaker attribution errors. Examples:
- The answer key specifies "Ferrari 488 GTB," but the source conversation contains only "this beauty" and the image caption reads "a red sports car." The car model exists only in an internal query field (annotator search strings for stock photos) that no memory system ingests. Systems are evaluated against facts they have no access to.
- "Last Saturday" on a Thursday should resolve to the preceding Saturday. The answer key says Sunday. A system that performs the date arithmetic correctly is penalized.
- 24 questions attribute statements to the wrong speaker. A system with accurate speaker tracking will contradict the answer key.
The theoretical maximum score for a perfect system is approximately 93.6%.

We also tested the LLM judge. LoCoMo uses gpt-4o-mini to score answers against the golden reference. We generated intentionally wrong but topically adjacent answers for all 1,540 questions and scored them using the same judge configuration and prompts used in published evaluations. The judge accepted 62.81% of them. Specific factual errors (wrong name, wrong date) were caught approximately 89% of the time. However, vague answers that identified the correct topic while missing every specific detail passed nearly two-thirds of the time. This is precisely the failure mode of weak retrieval, locating the right conversation but extracting nothing specific, and the benchmark rewards it.

There is also no standardized evaluation pipeline. Each system uses its own ingestion method (arguably necessary given architectural differences), its own answer generation prompt, and sometimes entirely different models. Scores are then compared in tables as if they share a common methodology. Multiple independent researchers have documented inability to reproduce published results (EverMemOS #73, Mem0 #3944, Zep scoring discrepancy). Full audit with all 99 errors documented, methodology, and reproducible scripts: locomo-audit

LongMemEval
LongMemEval-S (Wang et al., 2024) is the other frequently cited benchmark. The issue is different but equally fundamental: it does not effectively isolate memory capability from context window capacity. LongMemEval-S uses approximately 115K tokens of context per question. Current models support 200K to 1M token context windows, so the entire test corpus fits in a single context window for most current models. Mastra's research illustrates this: their full-context baseline scored 60.20% with gpt-4o (128K context window, near the 115K threshold). Their observational memory system scored 84.23% with the same model, largely by compressing context to fit more comfortably. The benchmark is measuring context window management efficiency rather than long-term memory retrieval. As context windows continue to grow, the full-context baseline will keep climbing and the benchmark will lose its ability to discriminate. LongMemEval-S tests whether a model can locate information within 115K tokens. That is a useful capability to measure, but it is a context window test, not a memory test.

LoCoMo-Plus
LoCoMo-Plus (Li et al., 2025) introduces a genuinely interesting new category: "cognitive" questions testing implicit inference rather than factual recall. These use cue-trigger pairs with deliberate semantic disconnect: the system must connect "I just adopted a rescue dog" (cue) to "what kind of pet food should I buy?" (trigger) across sessions without lexical overlap. The concept is sound and addresses a real gap in existing evaluation. The issues:
- It inherits all 1,540 original LoCoMo questions unchanged, including the 99 score-corrupting errors documented above.
- The improved judging methodology (task-specific prompts, three-tier scoring, 0.80+ human-LLM agreement) was only validated on the new cognitive questions. The original five categories retain the same broken ground truth with no revalidation.
- The judge model defaults to gpt-4o-mini.
- Same lack of pipeline standardization.
The new cognitive category is a meaningful contribution. The inherited evaluation infrastructure retains the problems described above.

Requirements for meaningful long-term memory evaluation
Based on this analysis, we see several requirements for benchmarks that can meaningfully
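The "Last Saturday" temporal-reasoning error above is easy to check mechanically: from any Thursday, the preceding Saturday is five days back, never a Sunday. A quick stdlib check (the specific dates are illustrative, not from the benchmark):

```python
from datetime import date, timedelta

def last_saturday(today: date) -> date:
    """Most recent Saturday strictly before `today` (weekday(): Mon=0 .. Sat=5)."""
    days_back = (today.weekday() - 5) % 7
    if days_back == 0:
        days_back = 7  # if today is Saturday, "last Saturday" is a week ago
    return today - timedelta(days=days_back)

# 2024-06-06 is a Thursday; the preceding Saturday is 2024-06-01.
print(last_saturday(date(2024, 6, 6)))
```

A system doing this arithmetic correctly contradicts the answer key, which is exactly the penalization the audit describes.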
Most people use ChatGPT like a search engine and wonder why it gives average answers
The prompt is the product. If you're typing one vague sentence and hitting enter, you're leaving 80% of the model's capability on the table.

Three things that changed my results immediately: giving ChatGPT a role before the task ("you are a direct response copywriter"), telling it the format you want the output in, and adding one line about who the answer is for.

That's it. No jailbreaks, no 500-word prompts. Just context, format, and audience. The model already knows what to do; it just needs you to stop being vague about what you actually want. What's the one prompting habit that made the biggest difference for you?
submitted by /u/PairFinancial2420
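The three habits above (role, format, audience) amount to a reusable template. A minimal sketch; the helper name and example values are placeholders, not a recommended prompt:

```python
def build_prompt(role: str, task: str, output_format: str, audience: str) -> str:
    """Compose a prompt from a role, the task, the desired format, and the audience."""
    return (
        f"You are {role}.\n"
        f"Task: {task}\n"
        f"Output format: {output_format}\n"
        f"Audience: {audience}"
    )

print(build_prompt(
    role="a direct response copywriter",
    task="rewrite this landing page headline",
    output_format="three options, one line each",
    audience="first-time visitors unfamiliar with the product",
))
```

The point isn't the template itself; it's that every request carries the same three pieces of context instead of a single vague sentence.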
Based on 24 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.